Abstract. ω-languages are becoming more and more relevant nowadays, when most
applications are "ever-running". Recent literature, mainly motivated by widening the
application of model checking techniques, extended the analysis of these languages from
the simple regular ones to various classes of languages with "visible syntax structure",
such as visibly pushdown languages (VPLs). Operator precedence languages (OPLs), instead,
were originally defined to support deterministic parsing and, though seemingly unrelated,
exhibit interesting relations with these classes of languages: OPLs strictly include
VPLs, enjoy all relevant closure properties, and have been characterized by a suitable
automata family and a logic notation.

In this paper we introduce operator precedence ω-languages (ωOPLs), investigating
various acceptance criteria and their closure properties. Whereas some properties are
natural extensions of those holding for regular languages, others required novel
investigation techniques. Application-oriented examples show the gain in expressiveness
and verifiability offered by ωOPLs w.r.t. smaller classes.

Keywords: ω-languages. Operator precedence languages. Push-down automata.
Closure properties. Infinite-state model checking. 

1 Introduction 

Languages of infinite strings, i.e. ω-languages, have been introduced to model
nonterminating processes; thus they are becoming more and more relevant nowadays, when
most applications are "ever-running", often in a distributed environment. The pioneering
work by Büchi and others investigated their main algebraic properties in the context of
finite state machines, pointing out commonalities and differences w.r.t. the finite
length counterpart [4,16].

More recent literature, mainly under the motivation of widening the application of 
model checking techniques to language classes as wide as possible, extended this analy- 
sis to various classes of languages with "visible structure", i.e., languages whose syntax 
structure is immediately visible in their strings: parenthesis languages, tree languages, 
visibly pushdown languages (VPLs) [1] are examples of such classes.

Operator precedence languages, instead, were defined by Floyd in the 1960s with the
original motivation of supporting deterministic parsing, which is trivial for visible
structure languages but is crucial for general context-free languages such as programming
languages [7], where structure is often left implicit (e.g. in arithmetic expressions).
Recently, these seemingly unrelated classes of languages have been shown to share most
major features; precisely, OPLs strictly include VPLs and enjoy all the same closure
properties [6]. This observation motivated characterizing OPLs in terms of a suitable
automata family [10] and in terms of a logic notation [11], which was missing in previous
literature.

In this paper we extend the investigation of OPL properties to the case of infinite
strings, i.e., we introduce and study operator precedence ω-languages (ωOPLs). As for
other families, we consider various acceptance criteria, their mutual expressiveness
relations, and their closure properties. Not surprisingly, some properties are natural
extensions of those holding for, say, regular languages or VPLs, whereas others required
different and novel investigation techniques, essentially due to the more general
management of the stack. These closures and the decidability of the emptiness problem are
a necessary step towards the possibility of performing infinite-state model checking.
Simple application-oriented examples show the considerable gain in expressiveness and
verifiability offered by ωOPLs w.r.t. previous classes.

The paper is organized as follows. The next section provides basic concepts on operator
precedence languages of finite-length words and on operator precedence automata able to
recognize them. Section 3 defines operator precedence automata which can deal with
infinite strings, analyzing various classical acceptance conditions for ω-abstract
machines. Section 4 proves the closure properties they enjoy w.r.t. typical operations on
ω-languages and also shows that the emptiness problem is decidable for these formalisms.
Finally, Section 5 draws some conclusions.

2 Preliminaries 

Operator precedence languages [6,7] have been characterized in terms of both a generative
formalism (operator precedence grammars, OPGs) and an equivalent operational one
(operator precedence automata, OPAs, named Floyd automata or FAs in [10]), but in this
paper we consider the latter, as it is better suited to model and verify nonterminating
computations of systems. We first recall the basic notation and definition of operator
precedence automata able to recognize words of finite length, as presented in [10].

Let $\Sigma$ be an alphabet. The empty string is denoted $\varepsilon$. Between the
symbols of the alphabet three types of operator precedence (OP) binary relations can
hold: yields precedence, equal in precedence and takes precedence, denoted $\lessdot$,
$\doteq$ and $\gtrdot$ respectively. We use a special symbol # not in $\Sigma$ to mark the
beginning and the end of any string. This is consistent with the typical operator parsing
technique that requires the lookback and lookahead of one character to determine the next
action to perform [8]. The initial # can only yield precedence, and other symbols can only
take precedence on the ending #.

Definition 1. An operator precedence matrix (OPM) $M$ over an alphabet $\Sigma$ is a
$|\Sigma \cup \{\#\}| \times |\Sigma \cup \{\#\}|$ array that with each ordered pair
$(a, b)$ associates the set $M_{ab}$ of OP relations holding between $a$ and $b$. $M$ is
conflict-free iff $\forall a, b \in \Sigma$, $|M_{ab}| \leq 1$. We call $(\Sigma, M)$ an
operator precedence alphabet if $M$ is a conflict-free OPM on $\Sigma$.

Between two OPMs $M_1$ and $M_2$, we define set inclusion and union:

$$M_1 \subseteq M_2 \text{ if } \forall a, b : (M_1)_{ab} \subseteq (M_2)_{ab}, \qquad
M = M_1 \cup M_2 \text{ if } \forall a, b : M_{ab} = (M_1)_{ab} \cup (M_2)_{ab}$$

If $M_{ab} = \{\circ\}$, with $\circ \in \{\lessdot, \doteq, \gtrdot\}$, we write
$a \circ b$. For strings $u, v$ we write $u \circ v$ if $u = xa$ and $v = by$ with
$a \circ b$. Two matrices are compatible if their union is conflict-free. A matrix is
complete if it contains no empty case.
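The matrix operations above can be illustrated with a small Python sketch. It is only an illustration under stated assumptions: an OPM is represented as a dictionary from symbol pairs to sets of relation symbols written in ASCII ('<', '=', '>'), and the function names are hypothetical, not taken from the paper.

```python
from typing import Dict, Set, Tuple

# Assumed encoding: '<' yields precedence, '=' equal in precedence,
# '>' takes precedence. Missing keys stand for empty entries.
OPM = Dict[Tuple[str, str], Set[str]]


def is_conflict_free(m: OPM) -> bool:
    """Every entry holds at most one precedence relation."""
    return all(len(rels) <= 1 for rels in m.values())


def union(m1: OPM, m2: OPM) -> OPM:
    """Entry-wise union of two matrices."""
    return {key: set(m1.get(key, set())) | set(m2.get(key, set()))
            for key in set(m1) | set(m2)}


def included(m1: OPM, m2: OPM) -> bool:
    """Entry-wise inclusion of m1 in m2."""
    return all(rels <= m2.get(key, set()) for key, rels in m1.items())


def compatible(m1: OPM, m2: OPM) -> bool:
    """Two matrices are compatible if their union is conflict-free."""
    return is_conflict_free(union(m1, m2))
```

For instance, a matrix stating that a yields precedence to b is not compatible with one stating that a takes precedence over b, since their union would hold two relations in the same entry.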

In the following we assume that $M$ is $\doteq$-acyclic, which means that
$c_1 \doteq c_2 \doteq \dots \doteq c_k \doteq c_1$ does not hold for any
$c_1, c_2, \dots, c_k \in \Sigma$, $k \geq 1$.

Definition 2. A nondeterministic operator precedence automaton (OPA) is a tuple
$A = \langle \Sigma, M, Q, I, F, \delta \rangle$ where:

- $(\Sigma, M)$ is a precedence alphabet,

- $Q$ is a set of states (disjoint from $\Sigma$),

- $I \subseteq Q$ is a set of initial states,

- $F \subseteq Q$ is a set of final states,

- $\delta : Q \times (\Sigma \cup Q) \to 2^Q$ is the transition function.

The transition function can be seen as the union of two disjoint functions:
$\delta_{push} : Q \times \Sigma \to 2^Q$ and $\delta_{flush} : Q \times Q \to 2^Q$.

An OPA can be represented by a graph with $Q$ as the set of vertices and $\Sigma \cup Q$
as the set of edge labels: there is an edge from state $q$ to state $p$ labeled by
$a \in \Sigma$ if and only if $p \in \delta_{push}(q, a)$, and there is an edge from state
$q$ to state $p$ labeled by $r \in Q$ if and only if $p \in \delta_{flush}(q, r)$. To
distinguish flush transitions from push transitions we denote the former ones by a double
arrow.

To define the semantics of the automaton, we introduce some notation. We use letters
$p, q, p_i, q_i, \dots$ for states in $Q$ and we set $\Sigma' = \{a' \mid a \in \Sigma\}$;
symbols in $\Sigma'$ are called marked symbols.

Let $\Gamma$ be $(\Sigma \cup \Sigma' \cup \{\#\}) \times Q$; we denote symbols in
$\Gamma$ as $[a\ q]$, $[a'\ q]$, or $[\#\ q]$, respectively. We set
$symbol([a\ q]) = symbol([a'\ q]) = a$, $symbol([\#\ q]) = \#$, and
$state([a\ q]) = state([a'\ q]) = state([\#\ q]) = q$. Given a string
$\beta = B_1 B_2 \dots B_n$ with $B_i \in \Gamma$, we set $state(\beta) = state(B_n)$.

A configuration is any pair $C = \langle \beta, w \rangle$, where
$\beta = B_1 B_2 \dots B_n \in \Gamma^*$, $symbol(B_1) = \#$, and
$w = a_1 a_2 \dots a_m \in \Sigma^*\#$. A configuration represents both the contents
$\beta$ of the stack and the part of input $w$ still to process.

A computation (run) of the automaton is a finite sequence of moves $C \vdash C_1$; there
are three kinds of moves, depending on the precedence relation between $symbol(B_n)$ and
$a_1$:

push move: if $symbol(B_n) \doteq a_1$ then
$C_1 = \langle \beta [a_1\ q], a_2 \dots a_m \rangle$, with
$q \in \delta_{push}(state(\beta), a_1)$;

mark move: if $symbol(B_n) \lessdot a_1$ then
$C_1 = \langle \beta [a_1'\ q], a_2 \dots a_m \rangle$, with
$q \in \delta_{push}(state(\beta), a_1)$;

flush move: if $symbol(B_n) \gtrdot a_1$ then let $i$ be the greatest index such that
$symbol(B_i) \in \Sigma'$ (such index always exists). Then
$C_1 = \langle B_1 B_2 \dots B_{i-2} [symbol(B_{i-1})\ q], a_1 a_2 \dots a_m \rangle$, with
$q \in \delta_{flush}(state(B_n), state(B_{i-1}))$.

Push and mark moves both push the input symbol on the top of the stack, together with the
new state computed by $\delta_{push}$; such moves differ only in the marking of the symbol
on top of the stack. The flush move is more complex: the symbols on the top of the stack
are removed until the first marked symbol (included), and the state of the next symbol
below them in the stack is updated by $\delta_{flush}$ according to the pair of states
that delimit the portion of the stack to be removed; notice that in this move the input
symbol is not consumed and it remains available for the following move.
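The three moves can be sketched in Python as follows. This is a minimal illustration, not the construction of the paper: it is restricted by assumption to deterministic automata, the names `run_opa`, `opm`, `push` and `flush` are hypothetical, relations are written in ASCII ('<', '=', '>'), and the entry left on top after a flush is assumed to keep its mark, as in the computations shown in the figures.

```python
from typing import Dict, List, Sequence, Set, Tuple

Entry = Tuple[str, bool, str]  # (symbol, is_marked, state)


def run_opa(word: Sequence[str],
            opm: Dict[Tuple[str, str], str],
            push: Dict[Tuple[str, str], str],
            flush: Dict[Tuple[str, str], str],
            initial: str,
            finals: Set[str]) -> bool:
    """Run a deterministic OPA on a finite word; True iff it accepts."""
    stack: List[Entry] = [("#", False, initial)]
    tape = list(word) + ["#"]  # the end delimiter is appended to the input
    pos = 0
    while True:
        top_symbol, _, top_state = stack[-1]
        a = tape[pos]
        if a == "#" and len(stack) == 1:
            # Configuration <[# q], #>: accept iff q is final.
            return top_state in finals
        rel = opm.get((top_symbol, a))
        if rel in ("<", "=") and a != "#":
            # Mark move ('<') or push move ('='): consume the input symbol.
            target = push.get((top_state, a))
            if target is None:
                return False
            stack.append((a, rel == "<", target))
            pos += 1
        elif rel == ">":
            # Flush move: pop down to the topmost marked entry (included)
            # and update the state of the entry just below it.
            marked = [k for k, entry in enumerate(stack) if entry[1]]
            if not marked:
                return False
            i = marked[-1]
            below_symbol, below_mark, below_state = stack[i - 1]
            target = flush.get((top_state, below_state))
            if target is None:
                return False
            stack = stack[:i - 1] + [(below_symbol, below_mark, target)]
        else:
            return False  # no applicable relation: the run is stuck
```

As a usage example, with the hypothetical relations # < a, a < a, a = b, b > b, b > # and a single final state q with all transitions looping on q, the call `run_opa("aabb", ...)` succeeds, while `"aab"` and `"abab"` are rejected because the run gets stuck on an empty matrix entry.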

Finally, we say that a configuration $[\#\ q_I]$ is starting if $q_I \in I$ and a
configuration $[\#\ q_F]$ is accepting if $q_F \in F$. The language accepted by the
automaton is defined as:

$$L(A) = \{ x \mid \langle [\#\ q_I], x\# \rangle \vdash^* \langle [\#\ q_F], \# \rangle,\ q_I \in I,\ q_F \in F \}.$$

Remark 1. The assumption of $\doteq$-acyclicity has been introduced in previous literature
[6,10] to prevent the construction of operator precedence grammars with unbounded length
of the right hand sides (r.h.s.) of productions. Correspondingly, in the presence of
$\doteq$-cycles in an OPM, an OPA could be compelled to an unbounded growth of the stack
before applying a flush move. The $\doteq$-acyclicity hypothesis could be replaced by the
weaker restriction of production r.h.s. of bounded length in grammars and of a bounded
number of consecutive push moves in automata, or could be removed altogether by allowing
such unbounded forms of grammars (e.g. with regular expressions as r.h.s.) and automata.
In this paper we accept a minimal loss of generative power¹ and adopt the simplifying
assumption of $\doteq$-acyclicity.

An OPA is deterministic when $I$ is a singleton and $\delta_{push}(q, a)$ and
$\delta_{flush}(q, p)$ have at most one element, for every $q, p \in Q$ and
$a \in \Sigma$.
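The determinism condition can be checked mechanically. The sketch below assumes, purely for illustration, that the two transition functions are stored as dictionaries mapping a pair to the set of target states; the name `is_deterministic` is hypothetical.

```python
from typing import Dict, Set, Tuple

Transitions = Dict[Tuple[str, str], Set[str]]


def is_deterministic(initial: Set[str],
                     push: Transitions,
                     flush: Transitions) -> bool:
    """One initial state, and at most one target per push/flush pair."""
    if len(initial) != 1:
        return False
    return (all(len(targets) <= 1 for targets in push.values())
            and all(len(targets) <= 1 for targets in flush.values()))
```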

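An operator precedence transducer can be defined in the usual way as a tuple
$T = \langle \Sigma, M, Q, I, F, O, \delta, \eta \rangle$ where $\Sigma, M, Q, I, F$ are
defined as in Definition 2, $O$ is a finite set of output symbols, and the transition
function $\delta$ and the output function $\eta$ are defined by
$\langle \delta, \eta \rangle : Q \times (\Sigma \cup Q) \to \mathcal{P}_F(Q \times O^*)$,
where $\mathcal{P}_F(Q \times O^*)$ denotes the set of finite subsets of $(Q \times O^*)$,
and $\langle \delta, \eta \rangle$ can be seen as the union of two disjoint functions,
$\langle \delta_{push}, \eta_{push} \rangle : Q \times \Sigma \to \mathcal{P}_F(Q \times O^*)$
and
$\langle \delta_{flush}, \eta_{flush} \rangle : Q \times Q \to \mathcal{P}_F(Q \times O^*)$.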

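A configuration of the transducer is denoted $\langle \beta, w \rangle \downarrow z$,
where $C = \langle \beta, w \rangle$ is the configuration of the underlying OPA and the
string after $\downarrow$ represents the output of the automaton in the configuration.
The transition relation $\vdash$ is naturally extended from OPAs, concatenating the output
symbol produced at each move with those generated in the previous moves. The transduction
$\tau$ generated by $T$, which maps each input word over $\Sigma$ to a set of output words
over $O$, is defined by

$$\tau(x) = \{ z \mid \langle [\#\ q_I], x\# \rangle \downarrow \varepsilon \vdash^* \langle [\#\ q_F], \# \rangle \downarrow z,\ q_I \in I,\ q_F \in F \}.$$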

Example 1. As an introductory example, consider a language of queries on a database
expressed in relational algebra. We consider a subset of classical operators (union,
intersection, selection $\sigma$, projection $\pi$ and natural join $\bowtie$). Just like
mathematical operators, the relational operators have precedences between them: unary
operators $\sigma$ and $\pi$ have highest priority, next highest is the "multiplicative"
operator $\bowtie$, lowest are the "additive" operators $\cup$ and $\cap$.

Denote as $T$ the set of tables of the database and, for the sake of simplicity, let $E$
be a set of conditions for the unary operators. The OPA depicted in Figure 1 accepts the
language of queries without parentheses on the alphabet
$\Sigma = T \cup \{\bowtie, \cup, \cap\} \cup \{\sigma, \pi\} \times E$, where we use
letters $A, B, R, \dots$ for elements in $T$ and we write $\sigma_{expr}$ for a pair
$(\sigma, expr)$ of selection with condition $expr$ (similarly for projection
$\pi_{expr}$). The same figure also shows an accepting computation on input
$A \cup B \bowtie C \bowtie \pi_{expr} D$.

Notice that the sentences of this language show the same structure as arithmetic
expressions with prioritized operators and without parentheses, which cannot be
represented by VPAs due to the particular shape of their OPM [6].

¹ An example language that cannot be generated with a $\doteq$-acyclic OPM is the
following:
$L = \{a^n (bc)^n \mid n \geq 0\} \cup \{b^n (ca)^n \mid n \geq 0\} \cup \{c^n (ab)^n \mid n \geq 0\}$.

Fig. 1: Automaton, precedence matrix and example of computation for the language of
Example 1.



Let $(\Sigma, M)$ be a precedence alphabet.

Definition 3. A simple chain is a word $a_0 a_1 a_2 \dots a_n a_{n+1}$, written as
${}^{a_0}\langle a_1 a_2 \dots a_n \rangle^{a_{n+1}}$, such that:
$a_0 \in \Sigma \cup \{\#\}$, $a_i \in \Sigma$ for every $i : 1 \leq i \leq n+1$,
$M_{a_0 a_{n+1}} \neq \emptyset$, and
$a_0 \lessdot a_1 \doteq a_2 \dots a_{n-1} \doteq a_n \gtrdot a_{n+1}$.

A composed chain is a word $a_0 x_0 a_1 x_1 a_2 \dots a_n x_n a_{n+1}$, where
${}^{a_0}\langle a_1 a_2 \dots a_n \rangle^{a_{n+1}}$ is a simple chain, and
$x_i \in \Sigma^*$ is the empty word or is such that
${}^{a_i}\langle x_i \rangle^{a_{i+1}}$ is a chain (simple or composed), for every
$i : 0 \leq i \leq n$. Such a composed chain will be written as
${}^{a_0}\langle x_0 a_1 x_1 a_2 \dots a_n x_n \rangle^{a_{n+1}}$.

A word $w$ over $(\Sigma, M)$ is compatible with $M$ iff for each pair of consecutive
letters $c, d$ in $w$ it holds that $M_{cd} \neq \emptyset$, and for each factor $x$ of
$\#w\#$ such that $x = a_0 x_0 a_1 x_1 a_2 \dots a_n x_n a_{n+1}$ where
$a_0 \lessdot a_1 \doteq a_2 \dots a_{n-1} \doteq a_n \gtrdot a_{n+1}$ and
$x_i \in \Sigma^*$ is the empty word or is such that
${}^{a_i}\langle x_i \rangle^{a_{i+1}}$ is a chain (simple or composed) for every
$0 \leq i \leq n$, it holds that $M_{a_0 a_{n+1}} \neq \emptyset$.

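Definition 4. Let $A$ be an operator precedence automaton. A support for the simple chain
${}^{a_0}\langle a_1 a_2 \dots a_n \rangle^{a_{n+1}}$ is any path in $A$ of the form

$$\stackrel{a_0}{\longrightarrow} q_0 \stackrel{a_1}{\longrightarrow} q_1 \longrightarrow \dots \longrightarrow q_{n-1} \stackrel{a_n}{\longrightarrow} q_n \stackrel{q_0}{\Longrightarrow} q_{n+1} \qquad (1)$$

Notice that the label of the last (and only) flush is exactly $q_0$, i.e. the first state
of the path; this flush is executed because of relation $a_n \gtrdot a_{n+1}$.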
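A support for the composed chain
${}^{a_0}\langle x_0 a_1 x_1 a_2 \dots a_n x_n \rangle^{a_{n+1}}$ is any path in $A$ of
the form

$$\stackrel{a_0}{\longrightarrow} q_0 \stackrel{x_0}{\leadsto} q'_0 \stackrel{a_1}{\longrightarrow} q_1 \stackrel{x_1}{\leadsto} q'_1 \stackrel{a_2}{\longrightarrow} \dots \stackrel{a_n}{\longrightarrow} q_n \stackrel{x_n}{\leadsto} q'_n \stackrel{q'_0}{\Longrightarrow} q_{n+1} \qquad (2)$$

where, for every $i : 0 \leq i \leq n$:

- if $x_i \neq \varepsilon$, then
  $\stackrel{a_i}{\longrightarrow} q_i \stackrel{x_i}{\leadsto} q'_i$ is a support for
  the chain ${}^{a_i}\langle x_i \rangle^{a_{i+1}}$, i.e., it can be decomposed as
  $\stackrel{a_i}{\longrightarrow} q_i \stackrel{x_i}{\leadsto} q''_i \stackrel{q_i}{\Longrightarrow} q'_i$.

- if $x_i = \varepsilon$, then $q'_i = q_i$.

Notice that the label of the last flush is exactly $q'_0$.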

The chains fully determine the structure of the parsing of any automaton on a word
compatible with $M$, and hence the structure of the syntax tree of the word. Indeed, if
the automaton performs the computation
$\langle [a\ q_0], xb \rangle \vdash^* \langle [a\ q], b \rangle$ on a factor $axb$, then
${}^{a}\langle x \rangle^{b}$ is necessarily a chain over $(\Sigma, M)$ and there exists a
support like (2) with $x = x_0 a_1 \dots a_n x_n$ and $q_{n+1} = q$.



3 Operator precedence ω-languages and automata

Let us now generalize operator precedence automata to deal with words of infinite 
length and to model nonterminating computations. 

Traditionally, ω-automata have been classified on the basis of the acceptance condition
of infinite words they are equipped with. All acceptance conditions refer to the
occurrence of states which are visited in a computation of the automaton, and they
generally impose constraints on those states that are encountered infinitely (or also
finitely) often during a run. Classical notions of acceptance (introduced by Büchi [4],
Muller [12], Rabin [14], Streett [15]) can be naturally adapted to ω-automata for operator
precedence languages and can be characterized according to a peculiar acceptance
component of the automaton on ω-words. We first introduce the model of nondeterministic
Büchi-operator precedence ω-automata with acceptance by final state; other models are
presented in Section 3.3.

As usual, we denote by $\Sigma^\omega$ the set of infinite-length words over $\Sigma$.
Thus, the symbol # occurs only at the beginning of an ω-word. Given a precedence alphabet
$(\Sigma, M)$, the definition of an ω-word compatible with the OPM $M$ and the notion of
syntax tree of an infinite-length word are the natural extension of these concepts for
finite strings.

Definition 5. A nondeterministic Büchi-operator precedence ω-automaton (ωOPBA) is given
by a tuple $A = \langle \Sigma, M, Q, I, F, \delta \rangle$, where $\Sigma, Q, I, F,
\delta$ are defined as for OPAs; the operator precedence matrix $M$ is restricted to be a
$|\Sigma \cup \{\#\}| \times |\Sigma|$ array, since ω-words are not terminated by the
delimiter #.

Configurations and (infinite) runs are defined as for operator precedence automata on
finite-length words. Then, let "$\exists^\omega i$" be a shorthand for "there exist
infinitely many $i$" and let $\mathcal{S}$ be a run of the automaton on a given word
$x \in \Sigma^\omega$. Define
$In(\mathcal{S}) = \{ q \in Q \mid \exists^\omega i\ \langle \beta_i, x_i \rangle \in \mathcal{S} \text{ with } state(\beta_i) = q \}$
as the set of states that occur infinitely often at the top of the stack of
configurations in $\mathcal{S}$. A run $\mathcal{S}$ of an ωOPBA on an infinite word
$x \in \Sigma^\omega$ is successful iff there exists a state $q_f \in F$ such that
$q_f \in In(\mathcal{S})$. $A$ accepts $x \in \Sigma^\omega$ iff there is a successful run
of $A$ on $x$. Furthermore, let the ω-language recognized by $A$ be
$L(A) = \{ x \in \Sigma^\omega \mid A \text{ accepts } x \}$.
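The acceptance condition can be made concrete on a simplified case. The sketch below assumes, as a hypothetical restriction that is not part of the definition, that the sequence of states at the top of the stack along a run is ultimately periodic and is given as a finite prefix followed by a cycle repeated forever; the function names are illustrative only.

```python
from typing import Iterable, Sequence, Set


def inf_states(prefix: Sequence[str], cycle: Sequence[str]) -> Set[str]:
    """States occurring infinitely often in prefix . cycle^omega.

    States seen only in the prefix occur finitely often, so the result
    depends on the cycle alone.
    """
    if not cycle:
        raise ValueError("an infinite run needs a non-empty cycle")
    return set(cycle)


def buchi_accepts(prefix: Sequence[str],
                  cycle: Sequence[str],
                  finals: Iterable[str]) -> bool:
    """Buchi acceptance: some final state occurs infinitely often."""
    return bool(inf_states(prefix, cycle) & set(finals))
```

Under these assumptions a run that visits a final state only within its prefix is not successful, whereas a run whose cycle contains a final state is.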

Operator precedence ω-transducers are defined in the natural way as for finite-length
words.

3.1 Some examples 

Example 2. Consider a software system which is supposed to work forever and may 
serve interrupt requests issued by different users. The system can manage three types 
of interrupts with different levels of priority, that affect the order by which they are 
served by the system: pending lower priority interrupts are postponed in favor of higher 
priority ones. 

This policy can be naturally specified by defining an alphabet of letters for ordinary
procedures and for interrupt symbols, and by formalizing the priority level among the
interrupt requests as OP relationships in the precedence matrix of an operator precedence
automaton on infinite-length words: an interrupt yields precedence ($\lessdot$) to higher
priority ones, which will be handled first, and takes precedence ($\gtrdot$) on lower
priority requests, whose processing is then suspended. Figure 2 shows an ωOPBA with
acceptance condition by final state which models the behavior of a system which may
execute two functions denoted $a$ and $b$, that may be suspended by interrupts of types
$int_0$, $int_1$ and $int_2$ with increasing level of priority. Calls and returns of the
procedures are denoted $call_a$, $call_b$, $ret_a$, $ret_b$. A request is actually served
as soon as the corresponding interrupt symbol is flushed from the top of the stack.
Figure 2 also presents the precedence matrix and an example computation of the system for
the infinite string
$call_a\ call_b\ ret_b\ call_b\ int_1\ int_1\ int_0\ ret_b \dots$
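The priority policy among interrupts described above can be sketched as a function that derives the corresponding matrix entries from numeric priority levels. This is only an illustration: the name `interrupt_relations` and the numeric levels are assumptions, relations are written in ASCII, and entries between interrupts of equal priority are deliberately left empty because the matrix of Figure 2 is not reproduced here.

```python
from typing import Dict, Tuple


def interrupt_relations(levels: Dict[str, int]) -> Dict[Tuple[str, str], str]:
    """Build OP relations (stack symbol, incoming symbol) from priorities."""
    relations: Dict[Tuple[str, str], str] = {}
    for x, priority_x in levels.items():
        for y, priority_y in levels.items():
            if priority_x < priority_y:
                # x yields precedence to the higher-priority interrupt y.
                relations[(x, y)] = "<"
            elif priority_x > priority_y:
                # x takes precedence over the lower-priority interrupt y.
                relations[(x, y)] = ">"
    return relations
```

With levels 0, 1 and 2 assigned to three hypothetical interrupt symbols, the lowest one yields precedence to both others, and the highest one takes precedence over both.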

Several variations of the above policy can be specified as well by similar ωOPBAs;
e.g., we might wish to formalize that high priority interrupts flush pending calls, whereas 
lower priority ones let the system resume serving pending calls once the interrupt has 
been served. We might also introduce an explicit symbol to formalize the end of serving 
an interrupt and specify that some events are disabled while serving interrupts with a 
given priority, etc. 

Example 3. Operator precedence automata on infinite-length words can also be used 
to model the run-time behavior of database systems, e.g., for modeling sequences of 
users' transactions with possible rollbacks. Other systems that exhibit an analogous 
behavior are revision control (or versioning) systems (such as subversion or git). As an 
example, consider a system for version management of files where a user can perform 
the following operations on documents: save them, access and modify them, undo one 
(or more) previous changes, restoring the previously saved version. 

The following alphabet represents the user's actions: sv (for save), wr (for write,
i.e. the document is opened and modified), ud (for a single undo operation), rb (for a
rollback operation, where all the changes occurred since the previously saved version
are discarded).
Fig. 2: Automaton, precedence matrix and example of computation for the language of
Example 2.

An ωOPBA which models the traces of possible actions of the user on a given document is
a single-state automaton $\langle \Sigma, M, \{q\}, \{q\}, \{q\}, \delta \rangle$, where
$\Sigma = \{sv, rb, wr, ud\}$, $\delta_{push}(q, a) = q$ for all $a \in \Sigma$,
$\delta_{flush}(q, q) = q$, and its OPM is:
Furthermore, one can even consider some specialized models of this system, that
represent various patterns of user behavior. For instance, one in which the user
regularly backs her work up, so that no more than $N$ changes which are not undone
(denoted wr as before) can occur between any two consecutive checkpoints sv (without any
rollback rb between them). Figure 3 shows the corresponding ωOPBA with $N = 2$, with the
same OPM $M$.
Fig. 3: ωOPBA of Example 3 with $N = 2$.



States 0, 1 and 2 denote respectively the presence of zero, one and two unmatched
changes between two symbols sv. All states of the ωOPBA are final.

An example of computation on the string sv wr ud rb sv wr wr ud sv wr rb wr sv ...
is shown in Figure 4.

3.2 Operator precedence ω-languages and visibly pushdown ω-languages

Classical families of automata, like Visibly Pushdown Automata [1], imply several
restrictions that hinder them from being able to deal with the concept of precedence
among symbols. These restrictions make them unsuitable to define systems like those of
Section 3.1 and, in general, all paradigms based on a model of priorities.

Noticeably, VPAs on infinite-length words are significantly extended by the class of
OPAs, since VPAs introduce a rigid partitioning on the alphabet symbols which heavily
constrains the possible relationships among them: a letter cannot assume a role dependent
on the context (as an interrupt which can yield or take precedence over another one
depending on the mutual priority), and this restriction has some consequences on their
expressive power w.r.t. ωOPLs. Actually, as it happens for finite-word languages [6,10],
one can prove the following result.

Theorem 1. The class of languages accepted by ωBVPA (nondeterministic Büchi visibly
pushdown ω-automata) is a proper subset of that accepted by ωOPBA.
Fig. 4: Example of computation for the specialized system of Example 3.





The behavior of version management systems like those in Example 3 cannot be modeled by
ωVPAs either, since the shape of their matrix allows only one-to-one relationships
between matching symbols (as do-undo actions on a single change, denoted wr and ud),
whereas the return to a previous version, undoing the whole sequence of changes performed
in the meanwhile, is represented by a many-to-one relationship (holding among symbols wr
and a single rb).

3.3 Other automata models for operator precedence ω-languages

There are several possibilities to define other classes of ω-languages. In order to do
that we introduce the following general definition.

Definition 6. A nondeterministic operator precedence ω-automaton (ωOPA) is given by a
tuple $A = \langle \Sigma, M, Q, I, \mathcal{F}, \delta \rangle$, where
$\Sigma, Q, I, \delta$ are defined as for OPAs; the operator precedence matrix $M$ is
restricted to be a $|\Sigma \cup \{\#\}| \times |\Sigma|$ array, since ω-words are not
terminated by the delimiter #; $\mathcal{F}$ is an acceptance component, distinctive of
the class (Büchi, Muller, ...) the automaton belongs to. Deterministic ωOPAs are
specified as for operator precedence automata on finite-length words.

A run is successful if it satisfies an acceptance condition on $\mathcal{F}$ based on a
specific recognizing mode. $A$ accepts $x \in \Sigma^\omega$ iff there is a successful run
of $A$ on $x$. Furthermore, let the ω-language recognized by $A$ be
$L(A) = \{ x \in \Sigma^\omega \mid A \text{ accepts } x \}$.
When $\mathcal{F}$ is a subset $F \subseteq Q$, Definition 6 leads to Definition 5 of
Büchi-operator precedence ω-automaton; ωOPBEA is a variant of ωOPBA obtained when using
the following acceptance condition: a word is recognized if the automaton traverses final
states with an empty stack infinitely often. Formally, a run $\mathcal{S}$ of an ωOPBEA is
successful iff there exists a state $q_f \in F$ such that configurations with stack
$[\#\ q_f]$ occur infinitely often in $\mathcal{S}$.
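The difference between the two Büchi-style conditions can be illustrated on a simplified case. The sketch assumes, hypothetically, that the part of a run repeated forever is given as a finite cycle of pairs (stack height, state at the top of the stack), where height 1 means that the stack contains only the entry for #; the function names are illustrative.

```python
from typing import Iterable, Sequence, Tuple

Step = Tuple[int, str]  # (stack height, state at the top of the stack)


def final_state_accepts(cycle: Sequence[Step], finals: Iterable[str]) -> bool:
    """Acceptance by final state: a final state recurs at the stack top."""
    final_set = set(finals)
    return any(state in final_set for _, state in cycle)


def empty_stack_accepts(cycle: Sequence[Step], finals: Iterable[str]) -> bool:
    """Acceptance with empty stack: a stack [# q_f] recurs, with q_f final."""
    final_set = set(finals)
    return any(height == 1 and state in final_set for height, state in cycle)
```

A cycle that visits a final state only while other symbols are still piled on the stack satisfies the first condition but not the second, which is the situation exploited in the proof of Proposition 1.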

Proposition 1. $\mathcal{L}$(ωOPBEA) $\subset$ $\mathcal{L}$(ωOPBA).

Proof. The inclusion is trivial by definition. To see why it is proper, one can consider
for instance the language $L_{repbdd}$ (studied in [1]) consisting of infinite words on
the alphabet $\{a, \bar{a}\}$, which can be interpreted as a language of calls and returns
of a procedure $a$, with the further constraint that there is always a finite number of
pending calls. A nondeterministic ωOPBA with final state acceptance condition can
nondeterministically guess which is the prefix of the word containing the last pending
call, and then recognizes the language $(L_{Dyck}(a, \bar{a}))^\omega$ of correctly nested
words. An ωOPBEA cannot recognize this language. In fact, it may accept a word iff it
reaches infinitely often a final configuration with empty stack during the parsing.
However, the automaton is never able to remove all the input symbols piled on the stack
since it cannot flush the pending calls interspersed among the correctly nested letters
$a$, otherwise it would either introduce conflicts in the OPM or it would not be able to
verify that they are in finite number.

The classical notion of acceptance for Muller automata can be likewise defined for
ωOPAs.

Definition 7. A nondeterministic Muller-operator precedence automaton (ωOPMA) is an ωOPA
$\langle \Sigma, M, Q, I, \mathcal{T}, \delta \rangle$ whose acceptance component is a
collection of subsets of $Q$, $\mathcal{T} \subseteq 2^Q$, called the table of the
automaton.

A run $\mathcal{S}$ of an ωOPMA on an infinite word $x \in \Sigma^\omega$ is successful
iff $In(\mathcal{S}) \in \mathcal{T}$, i.e. the set of states occurring infinitely often
on the stack is a set in the table $\mathcal{T}$.
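Muller acceptance can be illustrated in the same simplified setting used for ultimately periodic runs. The sketch assumes, hypothetically, that the states recurring at the top of the stack are given as a finite cycle; `muller_accepts` is an illustrative name, not notation from the paper.

```python
from typing import Iterable, Sequence


def muller_accepts(cycle: Sequence[str],
                   table: Iterable[Iterable[str]]) -> bool:
    """The set of infinitely recurring states must be a member of the table."""
    if not cycle:
        raise ValueError("an infinite run needs a non-empty cycle")
    recurring = frozenset(cycle)
    return recurring in {frozenset(entry) for entry in table}
```

Unlike the Büchi condition, which only asks for a non-empty intersection with the final states, the table must contain exactly the set of recurring states: a cycle over q1 and q2 is rejected by a table containing only the singleton of q1.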

In the case of classical finite-state automata on infinite words, nondeterministic Büchi
automata and nondeterministic Muller automata are equivalent and define the class of
ω-regular languages. Traditionally, Muller automata have been introduced to provide an
adequate acceptance mode for deterministic automata on ω-words. In fact, deterministic
Büchi automata cannot recognize all ω-regular languages, whereas deterministic Muller
automata are equivalent to nondeterministic Büchi ones.

For VPAs on infinite words, instead, the paper [1] showed that the classical
determinization algorithm of Büchi automata into deterministic Muller automata is no
longer valid, and deterministic Muller ωVPAs are strictly less powerful than
nondeterministic Büchi ωVPAs. A similar relationship holds for ωOPAs too.

The relationships among languages recognized by the different classes of operator
precedence ω-automata and visibly pushdown ω-languages are summarized in the structure of
Figure 5, where ωDOPBA and ωDOPMA denote the classes of deterministic ωOPBAs and
deterministic ωOPMAs respectively. The detailed proofs of the strict containment
relations holding among the classes in Figure 5 are presented in [13, Chapter 4] and we
do not report them here again for space reasons. In the following sections we provide the
proofs regarding the relationships between those classes which are not comparable (i.e.,
those linked with dashed lines in the figure), which are not included in [13].



$\mathcal{L}$(ωOPBA) = $\mathcal{L}$(ωOPMA)

$\mathcal{L}$(ωOPBEA)    $\mathcal{L}$(ωDOPMA)    $\mathcal{L}$(ωBVPA)

$\mathcal{L}$(ωDOPBA)

Fig. 5: Containment relations for ωOPLs. Solid lines denote strict inclusions; dashed
lines link classes which are not comparable.



3.4 Comparison between $\mathcal{L}$(ωBVPA) and $\mathcal{L}$(ωOPBEA)

$\mathcal{L}$(ωBVPA) and $\mathcal{L}$(ωOPBEA) are not comparable.

- $\mathcal{L}$(ωBVPA) $\not\subseteq$ $\mathcal{L}$(ωOPBEA)

Consider the language $L_{repbdd}$ (studied in [1]) consisting of infinite words on the
alphabet $\{a, \bar{a}\}$, which can be interpreted as a language of calls and returns of
a procedure $a$, with the further constraint that there is only a finite number of
pending calls. An ωBVPA can accept this language: it nondeterministically guesses which
is the prefix of the string containing the last pending call, and it can subsequently
recognize the language $(L_{Dyck}(a, \bar{a}))^\omega$ of correctly nested words.
An ωOPBEA cannot recognize this language, as seen in the proof of Proposition 1.

- $\mathcal{L}$(ωBVPA) $\not\supseteq$ $\mathcal{L}$(ωOPBEA)

Consider the system introduced in Example 4 of [10], which describes the stack management
of a programming language able to handle nested exceptions. No ωBVPA can express the
language of the infinite computations of this system because of the shape of the
precedence matrix, which is not compatible with the matrix of a VPA.

The automaton presented in the figure of that Example 4, which is able to recognize this
language, can instead be interpreted as an ωOPBEA.

Note also that the same automaton can be considered as an ωOPBA: it is deterministic by
construction, so there exists also an ωDOPBA able to model this system, and
$\mathcal{L}$(ωBVPA) $\not\supseteq$ $\mathcal{L}$(ωDOPBA). Moreover, since
$\mathcal{L}$(ωDOPBA) $\subset$ $\mathcal{L}$(ωDOPMA), an ωDOPMA can recognize it too;
thus $\mathcal{L}$(ωBVPA) $\not\supseteq$ $\mathcal{L}$(ωDOPMA).
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3.5 Comparison between £.(<jBVPA) and £.(6jDOPBA) 
£(wBVPA) and £(wDOPBA) are not comparable. 

- £(wBVPA) ^ £(wDOPBA) 

Consider the language on the alphabet E - {a, h\. 

L\-{a^E'^:a contains finitely many letters a } (3) 

It can be recognized by an wBVPA, but no wDOPBA can accept it. 

In fact, an wBVPA can recognize words of L\ finding nondeterministically the last 

letter a in a word and then reading suffix h" . 

The proof that no ωDOPBA can recognize $L_1$ resembles the classical proof (see e.g. iTSIl) that deterministic Büchi finite-state automata are strictly weaker than nondeterministic Büchi finite-state ones. We outline the proof here for the sake of completeness.

Assume that there exists an ωDOPBA $\mathcal{B}$ which recognizes $L_1$. Notice that, in general, according to the definition of push/mark/flush moves of an operator precedence automaton (finite or ω), given any configuration $C = \langle \beta, w \rangle$, the state piled up at the top of the stack with a transition $\langle \beta, w \rangle \vdash \langle \beta', w' \rangle$, namely $\mathit{state}(\beta')$, is exactly the state reached by the automaton on its state-graph. Thus, during a run on a word $x \in \Sigma^{\omega}$, configurations with stack $\beta_i$ with $\mathit{state}(\beta_i) \in F$ occur infinitely often iff the automaton visits infinitely often states in $F$ in its graph.

Now, the infinite word $x = b^{\omega}$ belongs to $L_1$, since it contains no (and hence a finite number of) letters $a$. Then, there exists a unique run of $\mathcal{B}$ on this string which visits final states infinitely often. Let $b^{n_1}$ be the prefix read by $\mathcal{B}$ until the first visited final state.

But $b^{n_1} a b^{\omega}$ also belongs to $L_1$, hence there exists a final state reached reading the prefix $b^{n_1} a b^{n_2}$, for some $n_2 \in \mathbb{N}$.

In general, one can find a sequence of finite words $b^{n_1} a b^{n_2} \ldots a b^{n_k}$ ($k \geq 1$) such that the automaton has a unique run on them, and for each such run it reaches a final state (placing it at the top of the stack) after reading every prefix $b^{n_1} a b^{n_2} \ldots a b^{n_i}$, $\forall i \leq k$. Therefore, there exists a (unique) run of $\mathcal{B}$ on the ω-word $w = b^{n_1} a b^{n_2} \ldots$ such that $\mathcal{B}$ visits final states infinitely often, and thus reaches infinitely often configurations $C = \langle \beta, w \rangle$ with $\mathit{state}(\beta) \in F$.

However, $w$ cannot be accepted by $\mathcal{B}$ since it contains infinitely many letters $a$, and this is a contradiction.

- $\mathcal{L}(\omega\mathrm{BVPA}) \nsupseteq \mathcal{L}(\omega\mathrm{DOPBA})$
See Section 3.4.
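
The nondeterministic guess used for $L_1$ can be illustrated on a finite-state analogue. The following is a minimal sketch, not the ωBVPA of the paper: it ignores the stack altogether and uses a two-state nondeterministic Büchi automaton whose names (`DELTA`, `read_loop`, `accepts_lasso`) are assumptions introduced here. Acceptance is checked only on ultimately periodic words of the form prefix · loop^ω.

```python
# Finite-state Büchi automaton over {a, b}, used only as an analogue of the
# omega-BVPA mentioned in the text (no stack is modelled). All names are illustrative.
# State 0 loops on every letter; on a letter b it may guess that the last a has
# already been read and move to state 1, which accepts only further letters b.
DELTA = {(0, "a"): {0}, (0, "b"): {0, 1}, (1, "b"): {1}}
INITIAL = {0}
FINAL = {1}


def read_loop(state, loop):
    """Return all (target, visited_final) pairs for runs reading `loop` once from `state`."""
    runs = {(state, False)}
    for letter in loop:
        runs = {(t, seen or t in FINAL)
                for (s, seen) in runs
                for t in DELTA.get((s, letter), set())}
    return runs


def accepts_lasso(prefix, loop):
    """Decide Büchi acceptance of the ultimately periodic word prefix . loop^omega."""
    assert loop, "the periodic part must be non-empty"
    current = set(INITIAL)
    for letter in prefix:
        current = {t for s in current for t in DELTA.get((s, letter), set())}
    # Collect the states reachable after prefix . loop^k for some k >= 0.
    reachable, frontier = set(current), set(current)
    while frontier:
        frontier = {t for s in frontier for (t, _) in read_loop(s, loop)} - reachable
        reachable |= frontier
    # Accept iff some reachable state lies on a cycle (over copies of `loop`)
    # that visits a final state.
    for c in reachable:
        seen, todo = set(), [(c, False)]
        while todo:
            s, flag = todo.pop()
            for (t, f) in read_loop(s, loop):
                node = (t, flag or f)
                if node == (c, True):
                    return True
                if node not in seen:
                    seen.add(node)
                    todo.append(node)
    return False


print(accepts_lasso("abab", "b"))   # word with finitely many letters a
print(accepts_lasso("", "ab"))      # word with infinitely many letters a
```

The move from state 0 to state 1 is the guess that the last $a$ has been read; a deterministic automaton has no such choice, which is what the contradiction argument above exploits.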

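3.6 Comparison between $\mathcal{L}(\omega\mathrm{BVPA})$ and $\mathcal{L}(\omega\mathrm{DOPMA})$

$\mathcal{L}(\omega\mathrm{BVPA})$ and $\mathcal{L}(\omega\mathrm{DOPMA})$ are not comparable.

- $\mathcal{L}(\omega\mathrm{BVPA}) \nsubseteq \mathcal{L}(\omega\mathrm{DOPMA})$

No ωDOPMA can recognize the language $L_{repbdd}$ (the proof can be found in ifTSl), whereas an ωBVPA can accept it (see O]).

- $\mathcal{L}(\omega\mathrm{BVPA}) \nsupseteq \mathcal{L}(\omega\mathrm{DOPMA})$
See Section 3.4.

3.7 Comparison between $\mathcal{L}(\omega\mathrm{OPBEA})$ and $\mathcal{L}(\omega\mathrm{DOPBA})$

$\mathcal{L}(\omega\mathrm{OPBEA})$ and $\mathcal{L}(\omega\mathrm{DOPBA})$ are not comparable.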

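- $\mathcal{L}(\omega\mathrm{OPBEA}) \nsubseteq \mathcal{L}(\omega\mathrm{DOPBA})$

Language $L_1$ (Equation (3)) cannot be recognized by an ωDOPBA (see Section 3.5), but there exists an ωOPBEA accepting it, depicted in Figure 6 along with its precedence matrix (where $\circ \in \{\lessdot, \gtrdot\}$ can be any precedence relation):

$$\begin{array}{c|cc}
 & a & b \\
\hline
a & \circ & \gtrdot \\
b & \circ & \gtrdot \\
\# & \lessdot & \lessdot
\end{array}$$

[Diagram: automaton with edges labelled a, b and b.]

Fig. 6: ωOPBEA recognizing $L_1 = \{\alpha \in \Sigma^{\omega} : \alpha \text{ contains finitely many letters } a\}$ and its OPM.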



- $\mathcal{L}(\omega\mathrm{OPBEA}) \nsupseteq \mathcal{L}(\omega\mathrm{DOPBA})$

Let $L_2$ be the language $a^2 L_3^{\omega}$ with $L_3 = \{a^k b^k \mid k \geq 1\}$ and where, in general, for a set of finite words $L \subseteq A^*$, one defines $L^{\omega} = \{\alpha \in A^{\omega} \mid \alpha = w_0 w_1 \ldots \text{ with } w_i \in L \text{ for } i \geq 0\}$.

No ωOPBEA can recognize this language. Indeed, words in $L_3$ can be recognized only with the OPM $M$ depicted in Figure 7, where $\circ$ can be any precedence relation: clearly, using any other OPM there exist words in $L_3$ and $L_2 = a^2 L_3^{\omega}$ which could not be recognized. Thus, because of the OP relation $a \lessdot a$, an ωOPBEA piles up on the stack the first sequence $a^2$ of a word and cannot remove it afterwards; hence it cannot empty the stack infinitely often to accept a string in $L_2$.

$$\begin{array}{c|cc}
 & a & b \\
\hline
a & \lessdot & \doteq \\
b & \circ & \gtrdot \\
\# & \lessdot &
\end{array}$$

Fig. 7: OPM for language $L_2$ of Section 3.7



There is, however, an ωDOPBA that recognizes such a language (Figure 8). Incidentally notice that, since $\mathcal{L}(\omega\mathrm{DOPBA}) \subseteq \mathcal{L}(\omega\mathrm{DOPMA})$, an ωDOPMA can recognize it too; thus $\mathcal{L}(\omega\mathrm{OPBEA}) \nsupseteq \mathcal{L}(\omega\mathrm{DOPMA})$.

Fig. 8: ωDOPBA recognizing language $L_2$ of Section 3.7
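
The stack argument used for $L_2$ can be checked on finite prefixes with a small simulation. The sketch below is only illustrative: it tracks the shape of the stack and ignores states, it assumes the free entry of the matrix is set to $b \gtrdot a$, and it assumes that the fixed prefix of $L_2$ consists of two letters $a$. The names `OPM` and `heights_before_push` are introduced here and do not come from the paper.

```python
# Precedence relations are encoded as "<" (yields), "=" (equal) and ">" (takes).
# Assumed OPM: the shape of Figure 7, with the free entry (b, a) set to ">".
OPM = {
    ("#", "a"): "<",
    ("a", "a"): "<",
    ("a", "b"): "=",
    ("b", "a"): ">",
    ("b", "b"): ">",
}


def heights_before_push(word, opm=OPM):
    """Return the stack height reached just before each letter of `word` is pushed.

    Stack entries are (symbol, marked) pairs and the bottom entry is ("#", False),
    so a height of 1 means that the stack holds the delimiter only.
    States are ignored: only the shape of the stack matters here.
    """
    stack = [("#", False)]
    heights = []
    for letter in word:
        # Flush moves: the top symbol takes precedence over the lookahead.
        while opm[(stack[-1][0], letter)] == ">":
            # Remove entries down to and including the topmost marked one.
            while not stack.pop()[1]:
                pass
        heights.append(len(stack))
        # Mark move if the top yields precedence, plain push otherwise.
        stack.append((letter, opm[(stack[-1][0], letter)] == "<"))
    return heights


blocks = "ab" + "aabb" + "aaabbb" + "ab"
print(min(heights_before_push(blocks)[1:]))           # blocks of L_3 only
print(min(heights_before_push("aa" + blocks)[2:]))    # same blocks after the prefix aa
```

Without the prefix the stack returns to height 1 between blocks, whereas after the prefix the two initial letters stay on the stack for good, so an acceptance condition that requires an empty stack infinitely often can never be met.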



4 Closure properties and emptiness problem 

$\mathcal{L}(\omega\mathrm{OPBA})$ enjoys all closure and decidability properties necessary to perform model checking; thus, thanks to their greater expressive power, we believe that ωOPBAs represent a truly promising formalism for infinite-state model checking.

In the first part of this section we focus on the most interesting closure properties of ωOPAs, which are summarized in Table 1, where they are compared with the properties enjoyed by VPAs on infinite-length words. Binary operations are considered between languages with compatible OPMs.





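$$\begin{array}{l|cccc}
 & \mathcal{L}(\omega\mathrm{DOPBA}) & \mathcal{L}(\omega\mathrm{DOPMA}) & \mathcal{L}(\omega\mathrm{OPBA}) = \mathcal{L}(\omega\mathrm{OPMA}) & \mathcal{L}(\omega\mathrm{BVPA})\ [1] \\
\hline
\text{Intersection} & \text{Yes} & \text{Yes} & \text{Yes} & \text{Yes} \\
\text{Union} & \text{Yes} & \text{Yes} & \text{Yes} & \text{Yes} \\
\text{Complement} & \text{No} & \text{Yes} & \text{Yes} & \text{Yes} \\
L_1 \cdot L_2 & \text{No} & \text{No} & \text{Yes} & \text{Yes}
\end{array}$$

Table 1: Closure properties of families of ω-languages. ($L_1 \cdot L_2$ denotes the concatenation of a language of finite-length words $L_1$ and an ω-language $L_2$.)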

Closure properties for ωDOPBAs (under complement and concatenation with an OPL) and ωDOPMAs are not discussed here for reasons of space, but their proofs resemble those for classical families of ω-automata and can anyhow be found in [13]. Closure properties for ωDOPBAs under intersection and union are presented in Section 4.1.

We consider in detail the main family ωOPBA. This class is closed under Boolean operations between languages with compatible precedence matrices and under concatenation with a language of finite words accepted by an OPA. The emptiness problem is decidable for ωOPAs in polynomial time because they can be interpreted as pushdown automata on infinite-length words: e.g. IS shows an algorithm that decides the alternation-free modal μ-calculus for context-free processes, with linear complexity in the size of the system's representation; thus the emptiness problem for the intersection of the language recognized by a pushdown process and the language of a given property in this logic is decidable. Closures under intersection and union hold for ωOPBAs as for classical ω-regular languages and can be proved in a similar way [13]. Closures under complementation and concatenation required novel investigation techniques.

Closure under concatenation 

For classical families of automata (on finite or infinite-length words), the closure of the class of languages they recognize under concatenation is traditionally proved by resorting to a Thompson-like construction: given two automata that recognize languages of a given class, an automaton which accepts the concatenation of these languages is generally defined so that it simulates the moves of the first automaton while reading the first word of the concatenation and, once it reaches some final state, it switches to the initial states of the second automaton to begin the recognition of words of the second language.

This construction, however, is not adequate for the concatenation of a language of finite words recognized by a classical OPA and an ωOPL (recognized by an ωOPBA). In fact, a classical OPA accepts a finite word by reaching a final state and by emptying its stack thanks to the ending delimiter #. As regards the concatenation of a language recognized by an OPA and an ω-language (accepted by an ωOPBA) whose words are not ended by #, this condition is not necessarily guaranteed, and it might not be possible to complete the recognition of a word of the first language by simulating the behavior of its OPA according to the acceptance condition by final state and empty stack. As an example, for a language $L_1 \subseteq \Sigma^*$ and an ω-language $L_2 = \{a^{\omega}\}$ with compatible precedence matrices such that all letters of the alphabet yield precedence to symbol $a$ (i.e. $b \lessdot a$, $\forall b \in \Sigma$), the symbols still on the stack after reading words in $L_1$ cannot be removed with flush moves before or during the parsing of the second word in the concatenation, since the precedence relation $\lessdot$ implies that the letters read are only pushed on the stack. Thus, the stack cannot be emptied after the reading of the first word, and this prevents checking whether it actually belongs to the first language of the concatenation.

After reading the first finite word in the concatenation, it is not even possible to determine whether this word is accepted by checking if in its OPA there exists an ongoing run on it that could lead to a final state by flush moves induced by a potential delimiter #, since this check would require knowing the states already reached and piled on the stack, which are not visible without emptying the stack itself.

Closure under concatenation for the class of languages accepted by ωOPBAs with a language of finite words accepted by an OPA could be proved similarly as for classical automata if it were possible to recognize finite words by an OPA without emptying the stack and without even performing any flush move induced by symbol # immediately after reading the word; in this way the acceptance could be completed even when the words of the second language prevent emptying the stack.

To this aim, a possible solution is to introduce a variant of the semantics of the transition relation and of the acceptance condition for OPAs on finite-length words: a string $x$ is accepted if the automaton reaches a final state right at the end of the parsing of the whole word, and does not perform any flush move determined by the ending delimiter # to empty the stack; thus it stops just after having put the last symbol of $x$ on the stack. Precisely, the semantics of the transition relation differs from the definition of classical OPAs in that, once a configuration with the endmarker as lookahead is reached, the computation cannot evolve into any subsequent configuration, i.e. a flush move $C \vdash C_1$ with $C = \langle B_1 B_2 \ldots B_n, y\# \rangle$ and $\mathit{symbol}(B_n) \gtrdot y\#$ is performed only if $y \neq \varepsilon$. The language accepted by this variant of the automaton (denoted as $L$) is the set of words:

$$L(A) = \{x \mid \langle [\#\ q_I], x\# \rangle \vdash^{*} \langle \gamma [a\ q_F], \# \rangle,\ q_I \in I,\ q_F \in F,\ \gamma \in \Gamma^{*},\ a \in \Sigma \cup \{\#\}\}$$

We emphasize that, unlike normal acceptance by final state of a pushdown automaton, which can perform a number of ε-moves after reaching the end of a string and accept if just one of the visited states is final, this type of automaton cannot perform any (flush) move after reaching the endmarker through the last lookahead.
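
The difference between the two acceptance conditions can be made concrete with a small simulator. This is a minimal sketch under several assumptions: the automaton is deterministic, transition tables are plain dictionaries, a flush replaces the state stored just below the flushed chain, and the toy automaton (`OPM`, `PUSH`, `FLUSH`, `FINAL`) is invented here for illustration and does not appear in the paper.

```python
def run(word, opm, push, flush, initial, end_flush):
    """Run a deterministic OPA on `word` and return the final stack.

    Stack entries are [symbol, marked, state]. With end_flush=True the lookahead #
    triggers the closing flush moves (classical semantics); with end_flush=False
    the run stops right after the last push (variant semantics).
    """
    stack = [["#", False, initial]]
    for letter in list(word) + (["#"] if end_flush else []):
        while opm.get((stack[-1][0], letter)) == ">":
            top_state = stack[-1][2]
            while not stack.pop()[1]:      # drop entries down to the topmost marked one
                pass
            # The state below the flushed chain is replaced, as in a flush move.
            stack[-1][2] = flush[(top_state, stack[-1][2])]
        if letter != "#":
            marked = opm[(stack[-1][0], letter)] == "<"
            stack.append([letter, marked, push[(stack[-1][2], letter)]])
    return stack


# Toy automaton (an assumption of this sketch): under the classical semantics
# it accepts the words a, aa, aaa, ...
OPM = {("#", "a"): "<", ("a", "a"): "<", ("a", "#"): ">"}
PUSH = {(0, "a"): 0}
FLUSH = {(0, 0): 1, (1, 0): 1}
FINAL = {1}

classical = run("aa", OPM, PUSH, FLUSH, 0, end_flush=True)
variant = run("aa", OPM, PUSH, FLUSH, 0, end_flush=False)
print(len(classical) == 1 and classical[-1][2] in FINAL)   # classical acceptance
print(variant[-1][2] in FINAL)                             # variant acceptance
```

On this toy automaton the same transition tables accept `aa` classically but not under the variant semantics, because the final state is only reached through the closing flush moves. This is why an equivalent variant automaton has to be built with additional state components rather than obtained by simply reinterpreting the original one.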

Nevertheless, the variant and the classical definition of OPA are equivalent, as the following statements (Lemma 1 and Statement 1) prove.

Lemma 1. Let $A_1$ be a nondeterministic OPA defined on an OP alphabet $(\Sigma, M)$ with $s$ states. Then there exists a nondeterministic OPA $A_2$ with the same precedence matrix as $A_1$ and $O(|\Sigma| s^2)$ states such that $L(A_1) = L(A_2)$.

To build such a variant $A_2$ we need some further notation. Consider a word of finite length $w$ which is preceded by a delimiter # but which is not ended with such a symbol. Define a chain in a word $w$ as maximal if it does not belong to a larger composed chain. In a word of finite length preceded and ended by # only the outer chain $\langle {}^{\#} w {}^{\#} \rangle$ is maximal.

An open chain is a sequence of symbols $b_0 \lessdot a_1 \doteq a_2 \doteq \ldots \doteq a_n$, for $n \geq 1$. The body of a chain $\langle {}^{a} x {}^{b} \rangle$, simple or composed, is the word $x$. A letter $a \in \Sigma$ in a word $\# w \#$ with $w \in \Sigma^*$ or $\# w$ with $w \in \Sigma^{\omega}$ is pending if it does not belong to the body of a chain, i.e., once pushed on the stack when it is read, it will never be flushed afterwards.

A word $w$ which is preceded but not ended by a delimiter # can be factored in a unique way as a sequence of bodies of maximal chains $w_i$ and pending letters $a_i$ as

$$\# w = \# w_1 a_1 w_2 a_2 \ldots w_n a_n$$

where $\langle {}^{a_{i-1}} w_i {}^{a_i} \rangle$ are maximal chains and each $w_i$ can possibly be missing, with $a_0 = \#$ and $\forall i : 1 \leq i \leq n-1$, $a_i \lessdot a_{i+1}$ or $a_i \doteq a_{i+1}$.

In general, during the parsing of word $w$, the symbols of the string are put on the stack and, whenever a chain is recognized, the letters of its body are flushed away. Hence, after the parsing of the whole word the stack contains only the symbols $\# a_1 a_2 \ldots a_n$ and is structured as a sequence of open chains. Let $k$ be the number of open chains and denote by $a_{i_1} = a_1, a_{i_2}, \ldots, a_{i_k}$ their starting symbols; then the stack contains:

$$\# \lessdot a_{i_1} = a_1 \doteq a_2 \doteq \ldots \lessdot a_{i_2} \doteq a_{i_2+1} \ldots \lessdot a_{i_3} \doteq a_{i_3+1} \ldots \lessdot a_{i_k} \doteq a_{i_k+1} \ldots \doteq a_n$$
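
The factorization into bodies of maximal chains and pending letters can be computed by parsing the word and inspecting what is left on the stack. The sketch below is illustrative only: the precedence matrix `OPM` over $\{a, b\}$ and the function name `factorize` are assumptions made here, and states are ignored since only the stack shape matters.

```python
# Precedence relations are encoded as "<" (yields), "=" (equal) and ">" (takes).
# The matrix is a hypothetical example chosen for this sketch.
OPM = {
    ("#", "a"): "<",
    ("a", "a"): "<",
    ("a", "b"): "=",
    ("b", "a"): ">",
    ("b", "b"): ">",
}


def factorize(word, opm=OPM):
    """Split `word` (preceded, but not ended, by #) into bodies and pending letters.

    Returns a list of triples (body, pending_letter, starts_open_chain): `body` is
    the body of the maximal chain preceding the pending letter (possibly empty) and
    the flag tells whether the pending letter is marked, i.e. starts an open chain.
    """
    stack = [(-1, "#", False)]                 # (position, symbol, marked)
    for pos, letter in enumerate(word):
        # Flush every chain that is closed by the current letter.
        while opm[(stack[-1][1], letter)] == ">":
            while not stack.pop()[2]:
                pass
        stack.append((pos, letter, opm[(stack[-1][1], letter)] == "<"))
    # Letters still on the stack are pending; the text between them was flushed.
    factors, previous = [], -1
    for pos, letter, marked in stack[1:]:
        factors.append((word[previous + 1:pos], letter, marked))
        previous = pos
    return factors


print(factorize("aab"))
print(factorize("aabaabba"))
```

In the first word every letter is pending and there are two open chains (the second one being $a \doteq b$); in the second word the factor `abaabb` is the body of a maximal chain enclosed between two pending letters.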

When a word $w$ is parsed by a classical OPA, the automaton performs a series of flush moves at the end of the string due to the presence of the final symbol #. These moves progressively empty the stack, removing the open chains one by one and, for each such flush, they update the state of the automaton on the basis of the symbols which delimit the portion of the stack to be removed, which correspond to the state symbols at the end of the current open chain and at the end of the preceding open chain. The run is accepting if it leads to a final state after the flush moves.

As an example, the transition sequence below shows the flush moves of a classical OPA when it reaches the position of $a_n$:

$$\begin{array}{l}
\langle [\#\ q_1][a_1'\ q_2][a_2\ q_3] \ldots [a_{i_2-1}\ q_{i_2}][a_{i_2}'\ q_{i_2+1}] \ldots [a_{i_3-1}\ q_{i_3}] \ldots [a_{i_k-1}\ q_{i_k}][a_{i_k}'\ q_{i_k+1}] \ldots [a_n\ q_{n+1}],\ \# \rangle \\
\vdash \langle [\#\ q_1][a_1'\ q_2][a_2\ q_3] \ldots [a_{i_2-1}\ q_{i_2}] \ldots [a_{i_3-1}\ q_{i_3}] \ldots [a_{i_k-1}\ \hat{q}_{i_k} = \delta_{flush}(q_{n+1}, q_{i_k})],\ \# \rangle \\
\vdash^{*} \langle [\#\ q_1][a_1'\ q_2][a_2\ q_3] \ldots [a_{i_2-1}\ q_{i_2}][a_{i_2}'\ q_{i_2+1}] \ldots [a_{i_3-1}\ \hat{q}_{i_3} = \delta_{flush}(\hat{q}_{i_4}, q_{i_3})],\ \# \rangle \\
\vdash \langle [\#\ q_1][a_1'\ q_2][a_2\ q_3] \ldots [a_{i_2-1}\ \hat{q}_{i_2} = \delta_{flush}(\hat{q}_{i_3}, q_{i_2})],\ \# \rangle \\
\vdash \langle [\#\ \hat{q}_1 = \delta_{flush}(\hat{q}_{i_2}, q_1)],\ \# \rangle
\end{array}$$

A nondeterministic automaton that, unlike classical OPAs, does not resort to the delimiter # for the recognition of a string may guess nondeterministically the ending point of each open chain on the stack and may guess how, in an accepting run, the states in these points of the stack would be updated if the final flush moves were progressively performed. The automaton must behave as if, at the same time, it simulates two snapshots of the accepting run of a classical OPA: a move during the parsing of the string and a step during the final flush transitions which will later on empty the stack, leading to a final state. To this aim, the states of a classical OPA are augmented with an additional component to store the necessary information.

In the initial configuration, the symbol at the bottom of the stack comprises, along with an initial state $q$ of the original OPA $A_1$, an additional state, say $q_f$, which represents a final state of $A_1$. The additional component is propagated until the automaton nondeterministically identifies the first pending letter, which represents the beginning of the first open chain; at this time the component is updated with a new state chosen so that there exists a move from it in $A_1$ that can flush and replace the state at the bottom of the stack with the final one $q_f$. (Notice that if the beginning letter of the word is not a pending letter, i.e., the prefix of the word is a maximal chain, then after completing the parsing of the chain the initial state $q$ will be flushed and replaced on the bottom of the stack by a new state, say $r$, as in a classical OPA; in this case the last component added after reading the pending letter is chosen so that there exists a move in the graph of $A_1$ that can flush and replace the state $r$ with $q_f$.) Then, similarly, the additional component is propagated until the ending point of each open chain, until the conclusion of the parsing; while reading the pending letter that represents the beginning of the successive open chain, the automaton augments the new state on the stack with a placeholder chosen so that there is a flush move in $A_1$ from it that can replace the state at the end of the previous open chain with the additional component previously stacked, thus allowing a backward path of flush moves from each ending point of an open chain to the previous one, up to the final state initially stacked. If the forward path consisting of moves during the parsing of the string and this backward path of flush moves can consistently meet and be rejoined when the parsing of the input string stops, then they constitute an entire accepting run of the classical OPA.

A variant OPA $A_2$ equivalent to a given OPA $A_1$ thus may be defined so that, after reading each prefix of a word, it reaches a final state whenever, if the word were completed at that point with #, $A_1$ could reach an accepting state with a sequence of flush moves. In this way, $A_2$ can guess in advance which words may eventually lead to an accepting state of $A_1$, without having to wait until reading the delimiter # and to perform final flush moves.

Example 4. Consider the computation of the OPA in Example 1. If we consider the
input word of this computation without the ending marker #, then the sequence of 
pending letters on the stack, after the automaton puts on the stack the last symbol 
D, is # < U < M = x < 7Te_tpr < D. There are four open chains with starting sym- 
bols U, IX, TTejcpr, D, hence the computation ends with four consecutive flush moves 
determined by the delimiter #. The following figure shows the configuration just before looking ahead at the symbol #. The states (depicted within a box) at the end of the open chains are those placeholders that an equivalent variant OPA should guess in order to find in advance the last flush moves of the accepting run.

[Figure: configuration of the classical OPA just before the lookahead #, with the boxed states at the end of each open chain and the final flush moves leading to a state in $F_1$.]

The corresponding configuration of the variant OPA, with the augmented states, would be:

[Figure: configuration of the variant OPA with the augmented states.]

We are now ready to formally prove Lemma 1.

Proof. Let $A_1 = \langle \Sigma, M, Q_1, I_1, F_1, \delta_1 \rangle$ and define $A_2 = \langle \Sigma, M, Q_2, I_2, F_2, \delta_2 \rangle$ as follows.

- $Q_2 = \{B, Z, U\} \times \hat{\Sigma} \times Q_1 \times Q_1$, where $\hat{\Sigma} = \Sigma \cup \{\#\}$.

Hence, a state $\langle x, a, q, p \rangle$ of $A_2$ is a tuple whose first component denotes a nondeterministic guess for the next symbol of the word to be read, i.e., a pending letter starting an open chain ($Z$), or a pending letter within an open chain ($U$), or a symbol within a maximal chain ($B$). The second and third components of a state represent, respectively, the lookback letter $a$ read to reach the state, and the current state $q$ in $A_1$. To see the meaning of the last component, consider an accepting run of $A_1$ and let $q$ be the current state just before a mark move is going to be performed at the beginning of an open chain; also let $r$ be the state reached by the mark move and $s$ be the state on top of the stack when this open chain is to be flushed replacing $q$ with a new state $p$. Then, in the same position of the corresponding run of $A_2$, the current state would be $\langle Z, a, q, p \rangle \in Q_2$ and state $\langle x, a, r, s \rangle \in Q_2$ will be reached by $A_2$, i.e., the last component $p$ represents a guess about the state that will replace $q$ in $A_1$ when the starting open chain will be flushed. Hence we can consider only states $\langle Z, a, q, p \rangle \in Q_2$ such that $r \overset{q}{\Longrightarrow} p$ in $A_1$ for some $r \in Q_1$. In all other positions the last component of the states in $Q_2$ is simply propagated.

- $I_2 = \{\langle x, \#, q, q_F \rangle \mid x \in \{Z, B\},\ q \in I_1,\ q_F \in F_1\}$

- $F_2 = \{\langle Z, a, q, q \rangle \mid q \in Q_1,\ a \in \hat{\Sigma}\}$

- The transition function is defined as the union of two disjoint functions.

The push transition function $\delta_{2push} : Q_2 \times \Sigma \to 2^{Q_2}$ is defined as follows, where $p, q, r, s \in Q_1$, $a \in \hat{\Sigma}$, and $b, c \in \Sigma$.

• Mark of a pending letter at the beginning of an open chain. If $a \lessdot b$ then:

$$\delta_{2push}(\langle Z, a, q, p \rangle, b) = \{\langle x, b, r, s \rangle \mid x \in \{B, Z, U\},\ q \xrightarrow{b} r,\ s \overset{q}{\Longrightarrow} p \text{ in } A_1\}$$

• Push of a pending letter within an open chain. If $a \doteq b$ then:

$$\delta_{2push}(\langle U, a, q, p \rangle, b) = \{\langle x, b, r, p \rangle \mid x \in \{B, Z, U\},\ q \xrightarrow{b} r \text{ in } A_1\}$$

• Push/mark of a symbol of a maximal chain.

$$\delta_{2push}(\langle B, a, q, p \rangle, b) = \{\langle B, b, r, p \rangle \mid q \xrightarrow{b} r \text{ in } A_1\}$$

Notice that the second and third components of the states computed by $\delta_{2push}$ are independent of the first component of the starting state.

The flush transition function $\delta_{2flush} : Q_2 \times Q_2 \to 2^{Q_2}$ can be executed only within a maximal chain, since there are no flush moves determined by the ending delimiter:

$$\delta_{2flush}(\langle B, b, q, s \rangle, \langle B, c, p, s \rangle) = \{\langle x, c, r, s \rangle \mid x \in \{B, Z, U\},\ q \overset{p}{\Longrightarrow} r \text{ in } A_1\}$$

All other moves lead to an error state.
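
The definitions of $Q_2$, $I_2$, $F_2$ and of the two transition functions can be transcribed almost literally into code. The following is a minimal sketch, assuming that the transition functions of $A_1$ are given as dictionaries mapping `(q, b)` and `(q, p)` to sets of states; the function `build_variant`, the toy automaton and all identifiers are hypothetical names introduced for this illustration, and the error state is represented by the empty set.

```python
from itertools import product


def build_variant(sigma, opm, q1, i1, f1, push1, flush1):
    """Build the components of the variant automaton from those of A_1.

    push1 maps (q, b) to the set of r with a push edge q -b-> r;
    flush1 maps (q, p) to the set of r with a flush edge q =p=> r.
    """
    sigma_hat = set(sigma) | {"#"}
    guesses = ("B", "Z", "U")
    q2 = set(product(guesses, sigma_hat, q1, q1))
    i2 = {(x, "#", q, qf) for x in ("Z", "B") for q in i1 for qf in f1}
    f2 = {("Z", a, q, q) for a in sigma_hat for q in q1}

    def push2(state, b):
        x, a, q, p = state
        targets = push1.get((q, b), set())
        relation = opm.get((a, b))
        if x == "Z" and relation == "<":
            # Mark at the start of an open chain: guess s with s =q=> p in A_1.
            return {(y, b, r, s) for y in guesses for r in targets
                    for s in q1 if p in flush1.get((s, q), set())}
        if x == "U" and relation == "=":
            # Push within an open chain: the last component is propagated.
            return {(y, b, r, p) for y in guesses for r in targets}
        if x == "B":
            # Push or mark inside a maximal chain.
            return {("B", b, r, p) for r in targets}
        return set()

    def flush2(top, below):
        (x, _, q, s), (y, c, p, s_below) = top, below
        if x == "B" and y == "B" and s == s_below:
            return {(z, c, r, s) for z in guesses for r in flush1.get((q, p), set())}
        return set()

    return q2, i2, f2, push2, flush2


# Toy instance (an assumption of this sketch): one letter, two states.
OPM = {("#", "a"): "<", ("a", "a"): "<", ("a", "#"): ">"}
PUSH1 = {(0, "a"): {0}}
FLUSH1 = {(0, 0): {1}, (1, 0): {1}}
q2, i2, f2, push2, flush2 = build_variant({"a"}, OPM, {0, 1}, {0}, {1}, PUSH1, FLUSH1)

print(len(q2))                                   # 3 * (|Sigma| + 1) * s^2
print(sorted(push2(("Z", "#", 0, 1), "a") & f2))
```

The size of `q2` is $3(|\Sigma|+1)s^2$, which matches the $O(|\Sigma| s^2)$ bound of Lemma 1. In the toy instance a final state of the variant automaton is already reachable after a single mark move, without any closing flush.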

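The automata $A_1$ and $A_2$ recognize the same language, $L(A_1) = L(A_2)$.

Let us prove first $L(A_1) \subseteq L(A_2)$. Let $w \in L(A_1)$ be a finite-length word. Then there exists a support $q \overset{w}{\rightsquigarrow} q'$ in $A_1$ with $q \in I_1$ and $q' \in F_1$. If $w = w_1 a_1 w_2 a_2 \ldots w_n a_n \in L(A_1)$, where the $a_i$ are pending letters and the $w_i$ are maximal chains, let $k$ be the number of open chains that remain on the stack after the parsing of the last symbol in $\Sigma$ of $w$, and let $a_{i_1} = a_1, a_{i_2}, \ldots, a_{i_k}$ be their starting symbols. Also, for every $i = 2, \ldots, n$, let $t(i)$ be the greatest index $t$ such that $i_t < i$, i.e., $a_i$ is within the $t(i)$-th open chain starting with $a_{i_{t(i)}}$. In particular, for $i = n$, if $a_{n-1} \lessdot a_n$ then $i_k = n$, otherwise $t(n) = k$.

Then the above support for $w$ can be decomposed as

$$q = \tilde{q}_0 \overset{w_1}{\rightsquigarrow} q_1 \xrightarrow{a_1} \tilde{q}_1 \overset{w_2}{\rightsquigarrow} q_2 \xrightarrow{a_2} \ldots \overset{w_n}{\rightsquigarrow} q_n \xrightarrow{a_n} \tilde{q}_n = p_k \qquad (4)$$

$$\tilde{q}_n = p_k \overset{q_{i_k}}{\Longrightarrow} p_{k-1} \overset{q_{i_{k-1}}}{\Longrightarrow} p_{k-2} \Longrightarrow \ldots \Longrightarrow p_2 \overset{q_{i_2}}{\Longrightarrow} p_1 \overset{q_{i_1}}{\Longrightarrow} p_0 = q'$$

where $q_i = \tilde{q}_{i-1}$ if $w_i = \varepsilon$ for $i = 1, 2, \ldots, n$. Notice that, for every $t$, $q_{i_t}$ is the state reached in this path before the mark move that pushes symbol $a_{i_t}$ on the stack; moreover, when the open chain starting with $a_{i_t}$ is to be flushed, the current state is $p_t$ and then state $q_{i_t}$ is replaced with $p_{t-1}$ on top of the stack.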

Starting with state $\langle Z, \#, q_1, p_0 \rangle$ if $w_1 = \varepsilon$, or with $\langle B, \#, \tilde{q}_0, p_0 \rangle \overset{w_1}{\rightsquigarrow} \langle Z, \#, q_1, p_0 \rangle$ if $w_1 \neq \varepsilon$, an accepting computation of $A_2$ can be built on the basis of the following facts:

- Since $q_1 \xrightarrow{a_1} \tilde{q}_1$ and $p_1 \overset{q_1}{\Longrightarrow} p_0$ in $A_1$, then $\delta_{2push}(\langle Z, \#, q_1, p_0 \rangle, a_1) \ni \langle x, a_1, \tilde{q}_1, p_1 \rangle$ in $A_2$ for $x \in \{U, Z\}$. This is a mark move that can be applied at the beginning of the first open chain starting with $a_1$, where $p_1$ is the guess about the state that will be reached before such open chain will be flushed.

- In general, for every $t$, since $q_{i_t} \xrightarrow{a_{i_t}} \tilde{q}_{i_t}$ and $p_t \overset{q_{i_t}}{\Longrightarrow} p_{t-1}$ in $A_1$, then $\delta_2(\langle Z, a_{i_t - 1}, q_{i_t}, p_{t-1} \rangle, a_{i_t}) \ni \langle x, a_{i_t}, \tilde{q}_{i_t}, p_t \rangle$ for $x \in \{U, Z\}$. This is a mark move that can be applied at the beginning of the $t$-th open chain starting with $a_{i_t}$, where $p_t$ is the guess about the state that will be reached before such open chain will be flushed. In particular, if $i_k = n$, we can reach state $\langle Z, a_n, \tilde{q}_n, p_k \rangle$, which is final in $A_2$ since $\tilde{q}_n = p_k$.

- For every maximal chain $w_i$ of $w$ (with $i \geq 2$) consider its support $\tilde{q}_{i-1} \overset{w_i}{\rightsquigarrow} q_i$ in (4). Then in $A_2$ we have the sequence of moves "summarized" (with a natural overloading of the notation) by $\delta_2(\langle B, a_{i-1}, \tilde{q}_{i-1}, p_{t(i)} \rangle, w_i) \ni \langle x, a_{i-1}, q_i, p_{t(i)} \rangle$, where $x \in \{U, Z\}$. Notice that the last component of the states does not change because we are within a maximal chain. In particular, during the parsing of $w_i$ the last component is equal to $p_{t(i)}$, as guessed by the mark move at the beginning of the current open chain.

- For every $i \notin \{i_1, i_2, \ldots, i_k\}$, since $\delta_{1push}(q_i, a_i) \ni \tilde{q}_i$, then $\delta_{2push}(\langle U, a_{i-1}, q_i, p_{t(i)} \rangle, a_i)$ contains $\langle x, a_i, \tilde{q}_i, p_{t(i)} \rangle$, for $x \in \{B, Z, U\}$. In particular, if $n \neq i_k$, then $t(n) = k$ and for $i = n$ we can reach state $\langle Z, a_n, \tilde{q}_n, p_k \rangle$, which is final in $A_2$ since $\tilde{q}_n = p_k$.

Thus, by composing the previous moves in the right order, one can obtain an accepting computation for $w$ in $A_2$.

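Conversely, to prove that $L(A_2) \subseteq L(A_1)$, consider a finite word $w \in L(A_2)$. Then there exists a successful run of $A_2$ on $w$. Let $w$ be factorized as above; then the accepting run for $w$ can be decomposed as

$$\pi_0 \overset{w_1}{\rightsquigarrow} \rho_1 \xrightarrow{a_1} \pi_1 \overset{w_2}{\rightsquigarrow} \rho_2 \ldots \rho_i \xrightarrow{a_i} \pi_i \overset{w_{i+1}}{\rightsquigarrow} \ldots \overset{w_n}{\rightsquigarrow} \rho_n \xrightarrow{a_n} \pi_n$$

where $\pi_i, \rho_i \in Q_2$, $\rho_i = \pi_{i-1}$ if $w_i = \varepsilon$, $\pi_0 \in I_2$ and $\pi_n \in F_2$. By projecting this path on the third component of states $\pi_i$ and $\rho_i$ (given by, say, $p_i$ and $r_i \in Q_1$) we obtain a path in $A_1$ labelled by $w$. This path is not accepting because there are open chains left on the stack that need flushing, but we can complete this path arguing by induction on the structure of maximal chains according to the definition of $\delta_2$. More formally, one can verify that $Q_1$ contains suitable states $p_i$ (for $0 \leq i \leq n$), $r_i$ (for $1 \leq i \leq n$), $s_t$ (for $0 \leq t \leq k$), with $r_i = p_{i-1}$ whenever $w_i = \varepsilon$, such that the following facts hold.

- $\pi_0 \in I_2$, hence $\pi_0 = \langle x_0, \#, p_0, s_0 \rangle$, with $p_0 \in I_1$ and $s_0 \in F_1$; $x_0$ is $B$ if $w_1 \neq \varepsilon$, otherwise $x_0 = Z$.

- $\pi_0 \overset{w_1}{\rightsquigarrow} \rho_1$ in $A_2$ implies that the last component of state $\pi_0$ is propagated through chain $w_1$ without change; hence $\rho_1 = \langle Z, \#, r_1, s_0 \rangle$ with $p_0 \overset{w_1}{\rightsquigarrow} r_1$ in $A_1$.

- $\rho_1 \xrightarrow{a_1} \pi_1$ is a mark move of $A_2$ at the beginning of an open chain, and this implies that the last component of $\pi_1$ is new; hence we have $\pi_1 = \langle x_1, a_1, p_1, s_1 \rangle$ with $r_1 \xrightarrow{a_1} p_1$ and $s_1 \overset{r_1}{\Longrightarrow} s_0$ in $A_1$; the first component is $x_1 = B$ if $w_2 \neq \varepsilon$, otherwise $x_1$ equals $Z$ or $U$ according to whether $a_2$ starts an open chain or not, respectively.

- The flush moves within $\pi_i \overset{w_{i+1}}{\rightsquigarrow} \rho_{i+1}$ for $1 \leq i < i_2$, and the push moves within an open chain $\rho_i \xrightarrow{a_i} \pi_i$ for $1 < i < i_2$, propagate the last component of states with no change. Hence $\rho_i = \langle U, a_{i-1}, r_i, s_1 \rangle$ and $\pi_i = \langle x_i, a_i, p_i, s_1 \rangle$ with $p_{i-1} \overset{w_i}{\rightsquigarrow} r_i \xrightarrow{a_i} p_i$ in $A_1$. The first component is $x_i = B$ if $w_{i+1} \neq \varepsilon$, otherwise $x_i = Z$ for $i = i_2 - 1$ and $x_i = U$ in the other cases.

- $\rho_{i_2} \xrightarrow{a_{i_2}} \pi_{i_2}$ is a mark move of $A_2$ at the beginning of an open chain, and this implies that the last component of $\pi_{i_2}$ is new; hence we have $\pi_{i_2} = \langle x_{i_2}, a_{i_2}, p_{i_2}, s_2 \rangle$ with $r_{i_2} \xrightarrow{a_{i_2}} p_{i_2}$ and $s_2 \overset{r_{i_2}}{\Longrightarrow} s_1$ in $A_1$. The first component is $x_{i_2} = B$ if $w_{i_2+1} \neq \varepsilon$, otherwise $x_{i_2}$ equals $Z$ or $U$ according to whether $a_{i_2+1}$ starts an open chain or not, respectively.

- Similarly for the following moves in the run.

In general, we get

$$\begin{array}{ll}
\rho_i = \langle y_i, a_{i-1}, r_i, s_{t(i)} \rangle & \text{for every } i = 1, 2, \ldots, n, \\
\pi_i = \langle x_i, a_i, p_i, s_{t(i)} \rangle & \text{for every } i \notin \{i_1, i_2, \ldots, i_k\}, \\
\pi_{i_t} = \langle x_{i_t}, a_{i_t}, p_{i_t}, s_t \rangle & \text{for every } t = 1, 2, \ldots, k, \\
\text{with } r_i \xrightarrow{a_i} p_i,\ s_t \overset{r_{i_t}}{\Longrightarrow} s_{t-1},\ p_{i-1} \overset{w_i}{\rightsquigarrow} r_i \text{ in } A_1 & \\
\text{and } y_i \in \{Z, U\},\ x_i \in \{B, Z, U\} & \text{for every } i \text{ and } t.
\end{array}$$

By convention, $a_0 = \#$. For $i = n$ we have $n = i_k$ or $t(n) = k$, hence $\pi_n = \langle x_n, a_n, p_n, s_k \rangle$, and $p_n = s_k$ and $x_n = Z$ since $\pi_n \in F_2$. Thus, in $A_1$ there is an accepting run

$$I_1 \ni p_0 \overset{w_1}{\rightsquigarrow} r_1 \xrightarrow{a_1} p_1 \overset{w_2}{\rightsquigarrow} r_2 \ldots r_i \xrightarrow{a_i} p_i \overset{w_{i+1}}{\rightsquigarrow} \ldots \overset{w_n}{\rightsquigarrow} r_n \xrightarrow{a_n} p_n = s_k$$

$$p_n = s_k \overset{r_{i_k}}{\Longrightarrow} s_{k-1} \overset{r_{i_{k-1}}}{\Longrightarrow} \ldots \Longrightarrow s_2 \overset{r_{i_2}}{\Longrightarrow} s_1 \overset{r_{i_1}}{\Longrightarrow} s_0 \in F_1$$

and this concludes the proof of the lemma.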



The next Statement, although not necessary to prove closure under concatenation of $\mathcal{L}(\omega\mathrm{OPBA})$, completes the proof of equivalence between traditional and variant OPAs, showing how to define, for any variant OPA, a classical OPA which recognizes the same language.

Statement 1. Let $A_2$ be a nondeterministic OPA defined on an OP alphabet $(\Sigma, M)$ with $s$ states. Then there exists a nondeterministic OPA $A_1$ with the same precedence matrix as $A_2$ and $O(|\Sigma|^2 s)$ states such that $L(A_1) = L(A_2)$.

Proof. Let $A_2 = \langle \Sigma, M, Q, I, F, \delta \rangle$ and consider, first, an equivalent form for the automaton $A_2$, where all the states are simply enriched with a lookahead and a lookback symbol: $\tilde{A}_2 = \langle \Sigma, M, \tilde{Q}_2, \tilde{I}_2, \tilde{F}_2, \tilde{\delta}_2 \rangle$ where

- $\tilde{Q}_2 = \hat{\Sigma} \times Q \times \hat{\Sigma}$, where $\hat{\Sigma} = \Sigma \cup \{\#\}$, i.e. the first component of a state is the lookback symbol, the second component of the triple is a state of $A_2$ and the third component of the state is the lookahead symbol,

- $\tilde{I}_2 = \{\#\} \times I \times \{a \in \hat{\Sigma} \mid M_{\# a} \neq \emptyset\}$ is the set of initial states of $\tilde{A}_2$,

- $\tilde{F}_2 = (\{\#\} \cup \{b \in \Sigma : b \gtrdot \#\}) \times F \times \{\#\}$,

- and the transition function $\tilde{\delta}_2 : \tilde{Q}_2 \times (\Sigma \cup \tilde{Q}_2) \to 2^{\tilde{Q}_2}$ is defined in the following natural way:

• $\tilde{\delta}_{2push}(\langle a, q, b \rangle, b) = \{\langle b, p, c \rangle \mid p \in \delta_{push}(q, b) \wedge M_{ab} \in \{\lessdot, \doteq\} \wedge M_{bc} \neq \emptyset\}$, $\forall a \in \hat{\Sigma}$, $b \in \Sigma$, $q \in Q$

• $\tilde{\delta}_{2flush}(\langle a_1, q_1, a_2 \rangle, \langle b_1, q_2, b_2 \rangle) = \{\langle b_1, q_3, a_2 \rangle \mid q_3 \in \delta_{flush}(q_1, q_2) \wedge M_{a_1 a_2} = \gtrdot \wedge M_{b_1 a_2} \neq \emptyset\}$, $\forall a_1, a_2, b_2 \in \Sigma$, $\forall b_1 \in \hat{\Sigma}$, $\forall q_1, q_2 \in Q$.

It is clear that $L(\tilde{A}_2) = L(A_2)$. Furthermore, the final states of $\tilde{A}_2$ cannot be reached by flush edges: in fact, if there exists a transition $\langle a_1, q_1, a_2 \rangle \overset{\langle b_1, q_2, b_2 \rangle}{\Longrightarrow} \langle b_1, q_3, \# \rangle$ towards a final state $\langle b_1, q_3, \# \rangle$, then the third component of the flushed state and of the reached final state must be equal by definition of the transition function, i.e. $\langle a_1, q_1, a_2 \rangle = \langle a_1, q_1, \# \rangle$. But this flush transition cannot be performed by a variant OPA, which stops a computation right before reading the delimiter #, when the parsing of the word ends.

Hence, one may always refer to a variant OPA assuming that in its graph there are no flush moves towards final states.

It is then possible to describe an OPA $A_1$ equivalent to the variant OPA $A_2$ (or $\tilde{A}_2$).

$A_1 = \langle \Sigma, M, Q_1, I_1, F_1, \delta_1 \rangle$ is defined as $\tilde{A}_2$ but it is enriched with an additional state, which is the only final state of $A_1$ and which is reachable through a flush edge from all final states of $\tilde{A}_2$. Basically, its role is to let $A_1$ empty the stack after parsing a word that is accepted by $\tilde{A}_2$.

- $Q_1 = \tilde{Q}_2 \cup \{q_{accept}\}$

- $I_1 = \tilde{I}_2 \cup \{q_{accept}\}$ if $\tilde{I}_2 \cap \tilde{F}_2 \neq \emptyset$, or $I_1 = \tilde{I}_2$ otherwise

- $F_1 = \{q_{accept}\}$

- The transition function $\delta_1$ equals $\tilde{\delta}_2$ on all states in $\tilde{Q}_2$; in addition $A_1$ has departing flush edges from the final states in $\tilde{F}_2$ to $q_{accept}$, and $q_{accept}$ has no outgoing push/mark edge but only self-loop flush edges.

The push transition function $\delta_{1push} : Q_1 \times \Sigma \to 2^{Q_1}$ is defined as $\delta_{1push}(q, c) = \tilde{\delta}_{2push}(q, c)$, $\forall q \in \tilde{Q}_2$, $c \in \Sigma$, whereas $\delta_{1push}(q_{accept}, c)$ leads to an error state for any $c$.

The flush transition function $\delta_{1flush} : Q_1 \times Q_1 \to 2^{Q_1}$ is defined by:

$$\delta_{1flush}(q, p) = \tilde{\delta}_{2flush}(q, p), \quad \forall q, p \in \tilde{Q}_2$$

$$\delta_{1flush}(q, p) = q_{accept}, \quad \forall q \in (\tilde{F}_2 \cup \{q_{accept}\}),\ p \in \tilde{Q}_2$$

The two automata recognize the same language, $L(A_1) = L(A_2)$.

First of all, $L(A_1) \subseteq L(A_2)$: in fact, if the OPA $A_1$ recognizes a word, then it is either the empty word, and thus $q_{accept} \in I_1$ and also $\tilde{A}_2$ has a successful run on it, or $A_1$ recognizes a word $w \neq \varepsilon$ and there exists a run $S$ of $A_1$ which ends in the final state $q_{accept}$, emptying the stack. Notice that $q_{accept}$ is reached by a flush move from a state in $\tilde{F}_2$, say $q_f \in \tilde{F}_2$:

$$S : q_0 \in \tilde{I}_2 \overset{w}{\rightsquigarrow} q_f \overset{p \in \tilde{Q}_2}{\Longrightarrow} q_{accept}\ (\Longrightarrow q_{accept})$$

and $q_f$ itself is reached exactly when the parsing of the word $w$ is finished, since, as said before, a state in $\tilde{F}_2$ cannot be reached by flush moves. This condition is necessary to avoid the presence of sequences of flush moves from non-accepting states towards final states. Then the path from $q_0$ to $q_f$, which follows the same states and edges as $S$, represents a run of $\tilde{A}_2$ which ends in a final state $q_f$ right after the parsing of the whole word, thus accepting $w$. The direction from right to left, $L(A_1) \supseteq L(A_2)$, derives easily from the fact that, if $\tilde{A}_2$ accepts a word along a successful run, then $A_1$ recognizes the word along the same run, possibly emptying the stack in the final state $q_{accept}$. □

Given the variant for OPAs on finite words, it is possible to prove the closure under concatenation of the class of languages accepted by ωOPBAs with a language of finite words accepted by an OPA, as the following theorem (Theorem 2) states. Notice that its proof differs from the non-trivial proof of closure under concatenation of OPLs of finite-length words [6], which, instead, can be recognized deterministically.

Theorem 2. Let $L_1 \subseteq \Sigma^*$ be a language of finite words recognized by an OPA with OPM $M_1$ and $s_1$ states. Let $L_2 \subseteq \Sigma^{\omega}$ be an ω-language recognized by a nondeterministic ωOPBA with OPM $M_2$ compatible with $M_1$ and $s_2$ states.

Then the concatenation $L_1 \cdot L_2$ is also recognized by an ωOPBA with OPM $M_3 \supseteq M_1 \cup M_2$ and $O(|\Sigma|(s_1^2 + s_2^2))$ states.

Proof. Let $A_1 = \langle \Sigma, M_1, Q_1, I_1, F_1, \delta_1 \rangle$ be a nondeterministic OPA which recognizes language $L_1$ and let $A_2 = \langle \Sigma, M_2, Q_2, I_2, F_2, \delta_2 \rangle$ be a nondeterministic ωOPBA with OPM $M_2$ compatible with $M_1$ which accepts $L_2$. Suppose, without loss of generality, that $Q_1$ and $Q_2$ are disjoint.

To define an ωOPBA $A_3$ which accepts the language $L_1 \cdot L_2$, we first build an OPA in the variant form $A'_1 = \langle \Sigma, M_1, Q'_1, I'_1, F'_1, \delta'_1 \rangle$ such that $L(A'_1) = L(A_1)$.

The automaton $A_3$ may recognize the first finite words in the concatenation $L_1 \cdot L_2$ by simulating $A'_1$: during the parsing of the input string, if $A'_1$ reaches a final state at the end of a finite-length prefix, then this prefix belongs to $L_1$ and $A_3$ may immediately start the recognition of the second infinite string without the need to perform any flush move to empty the stack. From this point onwards, then, $A_3$ may check that the remaining infinite portion of the input belongs to $L_2$, behaving as the ωOPBA $A_2$. Notice, however, that, as happens for operator precedence languages of finite-length words [6], the strings of the concatenation of two OPLs may have syntax trees that significantly differ from the concatenation of the trees of the single words: the trees of the strings of the two languages may be merged, according to the precedence relations between the symbols of the words, in a completely new structure. From the point of view of the parsing of a string in $L_1 \cdot L_2$ by an automaton, the joining of the trees of two words in $L_1$ and $L_2$ may imply that the recognition and reduction by flush moves of a subtree with branches in a word in $L_1$ have to be postponed until the parsing of the other branches in the word in $L_2$ has been completed. Therefore, $A_3$ cannot merely read the second infinite word performing the same transitions as $A_2$, but it is still possible to simulate this ωOPBA by keeping in the states some summary information about its runs. In this way, while reading the second word in the concatenation, whenever $A_3$ has to reduce a subtree which extends to the previous word in $L_1$, and thus has to perform a flush move that involves the portion of the stack piled up during the parsing of the first word, it can still restore on the stack the state that $A_2$ would instead have reached, resuming the parsing of the second word thereon as in a run of $A_2$.

In particular, the automaton $A_3$ is defined as follows. Let $\hat\Sigma = \Sigma \cup \{\#\}$ and $A_3 =
\langle \Sigma, M_3, Q_3, I_3, F_3, \delta_3 \rangle$ where:

- $M_3 \supseteq M_1 \cup M_2$ and may be supposed to be a total matrix, for instance assigning
arbitrary precedence relations to the empty entries, so that the concatenation of the
languages $L_1$ and $L_2$ is well defined.

- $Q_3 = Q'_1 \cup \hat\Sigma \times Q_2 \times (Q_2 \cup \{-\})$, i.e. the set of states of $A_3$ includes the states of
$A'_1$, while the states of $A_2$ are extended with two components. In each such triple, the first component
is a lookback symbol, the second component is a state of $Q_2$ and the third records,
as in the construction for deterministic OPAs [9], the state stored with the
symbol that is the last marked symbol on the stack when the current input letter is read
in a run performed by $A_2$ on the infinite substring. Storing this component
is necessary to guarantee that, whenever the automaton $A_3$ has to perform a flush
move towards states piled in the stack during the recognition of the first word in the
concatenation, it is still possible to compute the state that $A_2$ would have reached
instead.

This third component is denoted '-' if all the preceding symbols in the stack have
been piled during the parsing of the first word of the concatenation (thus the stack
of $A_2$ is empty).

- $I_3 = \{\langle \#, p_0, - \rangle \mid p_0 \in I_2\}$ if $\varepsilon \in L_1$, or $I_3 = I'_1$ otherwise

- $F_3 = \hat\Sigma \times F_2 \times Q_2$

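- The transition function $\delta_3 : Q_3 \times (\Sigma \cup Q_3) \to 2^{Q_3}$ is defined as follows. The push
transition $\delta_{3\mathrm{push}} : Q_3 \times \Sigma \to 2^{Q_3}$ is defined by:

• $\delta_{3\mathrm{push}}(q_1, c) = \delta'_{1\mathrm{push}}(q_1, c)$, $\forall q_1 \in Q'_1, c \in \Sigma$, i.e. it simulates $A'_1$ on $Q'_1$

• $\delta_{3\mathrm{push}}(q_1, c) = \{\langle \#, p_0, - \rangle \mid p_0 \in I_2\}$, $\forall q_1 \in Q'_1, c \in \Sigma$ : $\exists q_f \in F'_1$ s.t. …

i.e. it reaches the initial states of $A_2$ after the recognition of a word in $L_1$

• … for $a \in \hat\Sigma$, $c \in \Sigma$, $p \in Q_2$, $r \in (Q_2 \cup \{-\})$

The flush transition $\delta_{3\mathrm{flush}} : Q_3 \times Q_3 \to 2^{Q_3}$ is defined by:

• $\delta_{3\mathrm{flush}}(q_1, p_1) = \delta'_{1\mathrm{flush}}(q_1, p_1)$, $\forall q_1, p_1 \in Q'_1$, i.e. it simulates $A'_1$ on $Q'_1$

• $\delta_{3\mathrm{flush}}(\langle \#, p, - \rangle, q) = \langle \#, p, - \rangle$, with $p \in Q_2$, $q \in Q'_1$

• $\delta_{3\mathrm{flush}}(\langle a_1, p_1, r_1 = p_2 \rangle, \langle a_2, p_2, r_2 \rangle) = \{\langle a_2, q, r_2 \rangle \mid q \in \delta_{2\mathrm{flush}}(p_1, p_2)\}$,
where $a_1 \in \Sigma$, $a_2 \in \hat\Sigma$

• $\delta_{3\mathrm{flush}}(\langle a, p, r \rangle, q) = \{\langle \#, s, - \rangle \mid s \in \delta_{2\mathrm{flush}}(p, r)\}$, for $a \in \Sigma$, $p, r \in Q_2$, $q \in Q'_1$

i.e. whenever the precedence relations induce a merging of the subtrees of the
words of the concatenation, $A_3$ restores the state $s$ at the bottom of the stack of
$A_2$ from which a run of $A_2$ will continue.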

It is clear that the ωOPBA $A_3$ recognizes $L_1 \cdot L_2$, thus the class of languages accepted
by ωOPBAs is closed under concatenation on the left with languages recognized by
OPAs. □
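
The four flush rules of $A_3$ in the proof above can be sketched in Python as follows. This is only an illustrative sketch under stated assumptions: states of $A'_1$ are encoded as plain strings, the extended states as triples `(lookback, state, marked)`, and the names `flush3`, `flush1`, `flush2`, `HASH` and `EMPTY` are hypothetical and do not appear in the paper. The two tables stand for $\delta'_{1\mathrm{flush}}$ and $\delta_{2\mathrm{flush}}$.

```python
from typing import Dict, Set, Tuple, Union

HASH = "#"    # lookback symbol used when the simulated stack of A2 is empty
EMPTY = "-"   # third component when no symbol of the second word is marked

Triple = Tuple[str, str, str]                 # (lookback, state of A2, marked state or "-")
State3 = Union[str, Triple]                   # plain strings stand for states of A'_1
FlushTable = Dict[Tuple[str, str], Set[str]]  # (current state, popped state) -> successors


def flush3(cur: State3, popped: State3,
           flush1: FlushTable, flush2: FlushTable) -> Set[State3]:
    """Successors of A3 when `cur` flushes towards the stacked state `popped`."""
    if isinstance(cur, str) and isinstance(popped, str):
        # Rule 1: both states belong to A'_1, so A3 simply simulates A'_1.
        return set(flush1.get((cur, popped), set()))
    if isinstance(cur, tuple) and isinstance(popped, str):
        a, p, r = cur
        if a == HASH and r == EMPTY:
            # Rule 2: the simulated stack of A2 is empty, the state is unchanged.
            return {cur}
        if a != HASH and r != EMPTY:
            # Rule 4: the flushed subtree reaches into the first word; A3 puts
            # back the state s that A2 would reach by flushing p towards r.
            return {(HASH, s, EMPTY) for s in flush2.get((p, r), set())}
        return set()  # case not covered by the four rules sketched here
    if isinstance(cur, tuple) and isinstance(popped, tuple):
        a1, p1, r1 = cur
        a2, p2, r2 = popped
        if a1 != HASH and r1 == p2:
            # Rule 3: both states simulate A2, which flushes p1 towards p2.
            return {(a2, q, r2) for q in flush2.get((p1, p2), set())}
    return set()
```

Keeping the marked state `r` inside the current state is what makes rule 4 possible: the state popped from the stack belongs to $A'_1$ and carries no information about $A_2$, so the argument of $\delta_{2\mathrm{flush}}$ has to come from the triple itself.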

Closure under complementation 

Theorem 3. Let $M$ be a conflict-free precedence matrix on an alphabet $\Sigma$. Denote by
$L_M \subseteq \Sigma^\omega$ the ω-language comprising all infinite words $x \in \Sigma^\omega$ compatible with $M$.
Let $L$ be an ω-language on $\Sigma$ that can be recognized by a nondeterministic ωOPBA with
precedence matrix $M$ and $s$ states. Then the complement of $L$ w.r.t. $L_M$ is recognized by
an ωOPBA with the same precedence matrix $M$ and $2^{O(s^2)}$ states.

Proof. The proof follows to some extent the structure of the corresponding proof for
Büchi VPAs [1], but it exhibits some relevant technical aspects which distinctly
characterize it; in particular, we need to introduce an ad-hoc factorization of ω-words due to
the more complex management of the stack performed by ωOPAs.

Let $A = \langle \Sigma, M, Q, I, F, \delta \rangle$ be a nondeterministic ωOPBA with $|Q| = s$. Without loss
of generality $A$ can be considered complete with respect to the transition function $\delta$, i.e.
there is a run of $A$ on every ω-word on $\Sigma$ compatible with $M$.

In general, a sentence on $\Sigma^\omega$ can be factored so as to distinguish
the subfactors of the string that can be recognized without resorting to the stack of the
automaton and those subwords for which the use of the stack is necessary.
More precisely, an ω-word $w \in \Sigma^\omega$ can be factored as a sequence of chains and pending
letters $w = w_1 w_2 w_3 \ldots$ where either $w_i = a_i \in \Sigma$ is a pending letter or $w_i = a_{i1} a_{i2} \ldots a_{in}$
is a finite sequence of letters such that $\langle {}^{l_i} w_i {}^{first_{i+1}} \rangle$ is a chain, where $l_i$ denotes the last
pending letter preceding $w_i$ in the word and $first_{i+1}$ denotes the first letter of word $w_{i+1}$.
Let also, by convention, $a_0 = \#$ be the first pending letter.

Notice that such a factorization is not unique, since a string $w_i$ can be nested into
a larger chain having the same preceding pending letter. The factorization is unique,
however, if we additionally require that $w_i$ has no prefix which is a chain.

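As an example, for the word $w = {\lessdot}\, a \lessdot c \gtrdot b \lessdot a \gtrdot d \gtrdot \ldots$, with precedence
relations in the OPM $a \gtrdot b$ and $b \lessdot d$, the unique factorization is $w = w_1 b w_3 w_4 b \ldots$,
where $b$ is a pending letter and $\langle {}^{\#} ac\, {}^{b} \rangle$, $\langle {}^{b} a\, {}^{d} \rangle$, $\langle {}^{b} d\, {}^{b} \rangle$ are chains.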
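
The three chains of this example can be checked mechanically. The following Python sketch is an illustration, not part of the paper's construction: it assumes the word starts with the letters a c b a d b and that the OPM contains the relations # ⋖ a, a ⋖ c, c ⋗ b, a ⋗ b, b ⋖ a, a ⋗ d, b ⋖ d, d ⋗ b (written `"<"` and `">"` below), and the function name `is_prefix_free_chain` is hypothetical. It runs the usual mark/push/flush discipline on a body placed between its two context letters and accepts only if the body is reduced as a whole, with no proper prefix already forming a chain.

```python
from typing import Dict, List, Tuple

# Precedence relations assumed for the example: "<" yields, ">" takes precedence.
OPM: Dict[Tuple[str, str], str] = {
    ("#", "a"): "<", ("a", "c"): "<", ("c", "b"): ">", ("a", "b"): ">",
    ("b", "a"): "<", ("a", "d"): ">", ("b", "d"): "<", ("d", "b"): ">",
}


def is_prefix_free_chain(left: str, body: str, right: str,
                         opm: Dict[Tuple[str, str], str]) -> bool:
    """True if `body` between `left` and `right` is a chain with no proper prefix that is a chain."""
    stack: List[Tuple[str, bool]] = [(left, False)]   # (symbol, marked?)
    for i, sym in enumerate(list(body) + [right]):
        is_right_context = (i == len(body))
        while True:
            rel = opm.get((stack[-1][0], sym))
            if rel == ">":
                # Flush: pop back to, and including, the most recent marked symbol.
                while len(stack) > 1 and not stack[-1][1]:
                    stack.pop()
                if len(stack) == 1:
                    return False          # no marked symbol: nothing to reduce
                stack.pop()
                if len(stack) == 1:
                    # Everything read so far has been reduced to the left context:
                    # fine at the right context, a chain prefix otherwise.
                    return is_right_context
                continue
            if rel in ("<", "=") and not is_right_context:
                stack.append((sym, rel == "<"))   # mark on "<", plain push on "="
                break
            return False                  # undefined relation, or body not closed
    return False
```

With these relations the bodies `ac`, `a` and `d` are accepted in their respective contexts, whereas `ad` between two occurrences of `b` is rejected because its prefix `a` is already a chain, which is the reason the factorization keeps `a` and `d` as separate factors.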

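Define a semisupport for the simple chain $\langle {}^{a_0} a_1 a_2 \ldots a_n {}^{a_{n+1}} \rangle$ as any path in $A$ of the form

$$q_0 \xrightarrow{a_1} q_1 \longrightarrow \cdots \longrightarrow q_{n-1} \xrightarrow{a_n} q_n \stackrel{q_0}{\Longrightarrow} q_{n+1} \qquad (5)$$

A semisupport for the composed chain, with no prefix that is a chain, $\langle {}^{a_0} a_1 x_1 a_2 \ldots a_n x_n {}^{a_{n+1}} \rangle$
is any path in $A$ of the form

$$q_0 \xrightarrow{a_1} q_1 \stackrel{x_1}{\leadsto} q'_1 \xrightarrow{a_2} \cdots \xrightarrow{a_n} q_n \stackrel{x_n}{\leadsto} q'_n \stackrel{q_0}{\Longrightarrow} q_{n+1} \qquad (6)$$

where, for every $i$ : $1 \le i \le n$:

- if $x_i \neq \varepsilon$, then $\xrightarrow{a_i} q_i \stackrel{x_i}{\leadsto} q'_i$ is a support for the chain $\langle {}^{a_i} x_i {}^{a_{i+1}} \rangle$, i.e., it can be
decomposed as $\xrightarrow{a_i} q_i \stackrel{x_i}{\leadsto} q''_i \stackrel{q_i}{\Longrightarrow} q'_i$.

- if $x_i = \varepsilon$, then $q'_i = q_i$.

Unlike the definition of the support for a simple chain (Equation 1) and a composed chain
(Equation 2), in a semisupport for a chain the initial state $q_0$ is not restricted to be the
state reached after reading symbol $a_0$.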

Let $x \in \Sigma^*$ be such that $\langle {}^{a} x {}^{b} \rangle$ is a chain for some $a$, $b$ and let $T(x)$ be the set of
all triples $(q, p, f) \in Q \times Q \times \{0, 1\}$ such that there exists a semisupport $q \stackrel{x}{\leadsto} p$ in $A$,
and $f = 1$ iff the semisupport contains a state in $F$. Also let $\mathcal{T}$ be the set of all such
$T(x)$, i.e., $\mathcal{T}$ contains the sets of triples identifying all semisupports for some chain, and set
$PR = \Sigma \cup \mathcal{T}$. The pseudorun for $w$ in $A$ is the ω-word $w' = y_1 y_2 y_3 \ldots \in PR^\omega$ where
$y_i = a_i$ if $w_i = a_i$, otherwise $y_i = T(w_i)$.

For the example above, then, $w' = T(ac)\, b\, T(a)\, T(d)\, b \ldots$.

We now define a nondeterministic Büchi finite-state automaton $A_R$ over alphabet
$PR$ that recognizes the pseudorun $w'$ of any $w \in L(A)$. $A_R$ has all states of $A$ and
transitions corresponding to $A$'s push transitions but it is devoid of flush edges (indeed
they cannot be taken by a regular automaton without a stack). In addition, for every
$S \in \mathcal{T}$ it is endowed with arcs labeled $S$ which link, for each triple $(q, p, f)$ in $S$, either
the pair of states $q, p$ or $q, p'$ if $f = 1$, where $p'$ is a new final state which summarizes
the states in $F$ met along the semisupport $q \leadsto p$ and which has the same outgoing
edges as $p$.
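
The summary arcs added to $A_R$ for one set $S$ can be sketched as below. This is a minimal illustration under assumptions: states are arbitrary hashable values, the fresh final copy $p'$ is encoded by the hypothetical helper `primed`, and copying the outgoing edges of $p$ onto $p'$ is left out.

```python
from typing import FrozenSet, Hashable, Set, Tuple

Triple = Tuple[Hashable, Hashable, int]             # (q, p, f)
Arc = Tuple[Hashable, FrozenSet[Triple], Hashable]  # (source, label S, target)


def primed(p: Hashable) -> Tuple[Hashable, str]:
    """Hypothetical encoding of the fresh final copy p' of state p."""
    return (p, "'")


def summary_arcs(S: FrozenSet[Triple]) -> Set[Arc]:
    """Arcs labeled S: q -> p when f = 0, and q -> p' when f = 1."""
    arcs: Set[Arc] = set()
    for q, p, f in S:
        # A semisupport that met a state of F is redirected to the final copy p'.
        target = primed(p) if f == 1 else p
        arcs.add((q, S, target))
    return arcs
```

Redirecting to $p'$ only when $f = 1$ lets the stackless automaton remember, in a single step, that a final state of $A$ was visited somewhere inside the chain summarized by $S$.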

Notice that, given a set $S \in \mathcal{T}$, the existence of an edge $S$ between the pairs of states
$q, p$ in the triples in $S$ can be decided in an effective way.

The automaton $A_R$ built so far is able to parse all pseudoruns and recognizes all
pseudoruns of ω-words recognized by $A$. However, since its moves are no longer
determined by the OPM $M$, it can also accept input words along the edges of the graph
of $A$ which are not pseudoruns, since they do not correspond to a correct factorization
on $PR$. This is irrelevant, however, since the aim of the proof is to devise an automaton
recognizing the complement of $L(A)$, and all the words in $L_M \setminus L(A)$ are parsed along
pseudoruns, which are not accepted by $A_R$. If one gives as input words only pseudoruns
(and not generic words on $PR$), then they will be accepted by $A_R$ if the corresponding
words on $\Sigma$ belong to $L(A)$, and they will be rejected if the corresponding words do not
belong to $L(A)$.

Given the Büchi finite-state automaton $A_R$ (which has $O(s)$ states), one
can now construct a deterministic Streett automaton $\mathcal{B}_R$ that accepts the complement of
$L(A_R)$ on the alphabet $PR$. If $\mathcal{B}_R$ receives as input words on $PR$ only pseudoruns, then
it will accept only the pseudoruns of words in $L_M \setminus L(A)$. The automaton $\mathcal{B}_R$ has $2^{O(s \log s)}$ states and $O(s)$
accepting constraints [16].

Consider then a nondeterministic transducer ωOPBA $\mathcal{B}$ that, on reading $w$,
generates online the aforementioned pseudorun $w'$, which will be given as input to $\mathcal{B}_R$. The
automaton $\mathcal{B}$ nondeterministically guesses whether the next input symbol is a pending
letter, the beginning of a chain appearing in the factorization of $w$, or a symbol within
such a chain, and uses stack symbols $Z$, $\bot$, or elements in $\mathcal{T}$, respectively, to distinguish
these three cases.

In order to produce $w'$, whenever the automaton reads a pending letter it outputs the
letter itself, whereas when it finishes recognizing a chain of the factorization, performing
a flush move towards a state with $\bot$ as second component, it outputs the set of all the pairs
of states which define a semisupport for the chain. Thus, the output $w'$ produced by $\mathcal{B}$
is unique, despite the nondeterminism of the transducer.

Formally, the transducer ωOPBA $\mathcal{B} = \langle \Sigma, M, Q_B, I_B, F_B, PR, \delta_B, \eta_B \rangle$ is defined as
follows:

- $Q_B = \hat\Sigma \times (\{Z, \bot\} \cup \mathcal{T})$ where $\hat\Sigma = \Sigma \cup \{\#\}$. The first component of a state in $Q_B$
denotes the lookback symbol read to reach the state, the second component
represents the guess whether the next symbol to be read is a pending letter ($Z$), the
beginning of a chain ($\bot$), or a letter within such a chain $w_i$ ($T \in \mathcal{T}$). In the third
case, $T$ contains all information necessary to correctly simulate the moves of $A$
during the parsing of the chain $w_i$ of $w$, and compute the corresponding symbol $y_i$
of $w'$. In particular, $T$ is a set comprising all triples $(r, q, v)$ where $r$ represents the
state reached before the last mark move, $q$ represents the current state reached by
$A$, and $v$ is a bit that records whether, while reading the chain, a state in $F$ has
been encountered (as in the construction of a deterministic OPA on words of finite
length [9], it is necessary to keep track of the state from which the parsing of a
chain started, to avoid erroneous merges of runs on flush moves).

- $I_B = \{\langle \#, \bot \rangle, \langle \#, Z \rangle\}$.

- $F_B = \{\langle a, \bot \rangle, \langle a, Z \rangle \mid a \in \hat\Sigma\}$.

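- The transition function and the output function are defined as the union of two
disjoint pairs of functions. Let $a \in \hat\Sigma$, $b, c \in \Sigma$, $T, S \in \mathcal{T}$. The push pair
$\langle \delta_{B\mathrm{push}}, \eta_{B\mathrm{push}} \rangle : Q_B \times \Sigma \to \wp_F(Q_B \times PR^*)$ is defined as follows,
where the symbols after $\downarrow$ denote the output of the move of the automaton.

• Push of a pending letter.

$$\langle \delta_{B\mathrm{push}}, \eta_{B\mathrm{push}} \rangle (\langle a, Z \rangle, b) = \{\langle b, \bot \rangle \downarrow b,\ \langle b, Z \rangle \downarrow b\}$$

• Mark at the beginning of a chain of the factorization. If $a \lessdot b$ then:

$$\langle \delta_{B\mathrm{push}}, \eta_{B\mathrm{push}} \rangle (\langle a, \bot \rangle, b) = \{\langle b, T \rangle \downarrow \varepsilon\}$$

where $T = \{(q, p, v) \mid q \in Q,\ p \in \delta_{\mathrm{push}}(q, b),\ v = 1 \text{ iff } p \in F\}$

• Push within a chain of the factorization.

$$\langle \delta_{B\mathrm{push}}, \eta_{B\mathrm{push}} \rangle (\langle a, T \rangle, b) = \{\langle b, S \rangle \downarrow \varepsilon\}$$

where

$$S = \left\{ (t, p, v) \;\middle|\; \exists (r, q, \xi) \in T \text{ s.t. } t = \begin{cases} q & \text{if } a \lessdot b \\ r & \text{if } a \doteq b \end{cases},\ p \in \delta_{\mathrm{push}}(q, b),\ v = \begin{cases} \xi & \text{if } p \notin F \\ 1 & \text{if } p \in F \end{cases} \right\}$$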



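The flush pair $\langle \delta_{B\mathrm{flush}}, \eta_{B\mathrm{flush}} \rangle : Q_B \times Q_B \to \wp_F(Q_B \times PR^*)$ is defined as follows.

• Flush at the end of a chain of the factorization.

$$\langle \delta_{B\mathrm{flush}}, \eta_{B\mathrm{flush}} \rangle (\langle b, T \rangle, \langle a, \bot \rangle) = \{\langle a, \bot \rangle \downarrow R,\ \langle a, Z \rangle \downarrow R\}$$

where

$$R = \left\{ (r, p, v) \;\middle|\; \exists (r, q, \xi) \in T \text{ s.t. } p \in \delta_{\mathrm{flush}}(q, r),\ v = \begin{cases} \xi & \text{if } p \notin F \\ 1 & \text{if } p \in F \end{cases} \right\}$$

• Flush within a chain of the factorization.

$$\langle \delta_{B\mathrm{flush}}, \eta_{B\mathrm{flush}} \rangle (\langle b, T \rangle, \langle c, S \rangle) = \{\langle c, R \rangle \downarrow \varepsilon\}$$

where

$$R = \left\{ (t, p, v) \;\middle|\; \exists (r, q, \xi) \in T,\ \exists (t, r, \zeta) \in S \text{ s.t. } p \in \delta_{\mathrm{flush}}(q, r),\ v = \begin{cases} \xi & \text{if } p \notin F \\ 1 & \text{if } p \in F \end{cases} \right\}$$

An error state is reached for any other case. In particular, no flush move is defined
when the second state has $Z$ as second component, nor when the first state has $Z$ or
$\bot$ as second component, consistently with the meaning of the stack symbols $Z$ and $\bot$.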
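
How the transducer updates its sets of triples along one chain can be sketched as follows. The sketch makes several assumptions that are not in the source: the tables `push` and `flush` stand for $\delta_{\mathrm{push}}$ and $\delta_{\mathrm{flush}}$ of $A$ as dictionaries of sets, the precedence relation is passed as the string `"<"` or `"="`, and in `push_within` the new state $p$ is taken from $\delta_{\mathrm{push}}(q, b)$, a condition read by analogy with the mark and flush rules. All function names are hypothetical.

```python
from typing import Dict, Iterable, Set, Tuple

Table = Dict[Tuple[str, str], Set[str]]   # (state, letter or state) -> successors
Triple = Tuple[str, str, int]             # (r, q, v)


def mark(states: Iterable[str], b: str, push: Table, final: Set[str]) -> Set[Triple]:
    """Mark at the beginning of a chain: start one simulation from every state q."""
    return {(q, p, int(p in final))
            for q in states for p in push.get((q, b), set())}


def push_within(T: Set[Triple], rel: str, b: str,
                push: Table, final: Set[str]) -> Set[Triple]:
    """Push within a chain; `rel` is "<" (a nested chain starts) or "="."""
    S: Set[Triple] = set()
    for r, q, xi in T:
        t = q if rel == "<" else r          # state before the last mark move
        for p in push.get((q, b), set()):   # assumed condition, see the lead-in
            S.add((t, p, 1 if p in final else xi))
    return S


def flush_end(T: Set[Triple], flush: Table, final: Set[str]) -> Set[Triple]:
    """Flush at the end of a chain: the set R that is emitted as output."""
    return {(r, p, 1 if p in final else xi)
            for r, q, xi in T for p in flush.get((q, r), set())}
```

For a chain made of a single letter, `flush_end(mark(...))` yields exactly the triples $(r, p, v)$ with $r \xrightarrow{b} q \stackrel{r}{\Longrightarrow} p$, i.e. the semisupports of a simple chain of length one.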

In the end, the final automaton to be built, which recognizes the complement of
$L = L(A)$ w.r.t. $L_M$, is the ωOPBA representing the product of $\mathcal{B}_R$ (converted to a Büchi
automaton), which has $2^{O(s \log s)}$ states, and $\mathcal{B}$, which has $|Q_B| = 2^{O(s^2)}$ states: while
reading $w$, $\mathcal{B}$ outputs the pseudorun $w'$ of $w$ online, and the states of $\mathcal{B}_R$ are updated
accordingly. The automaton accepts if both $\mathcal{B}$ and $\mathcal{B}_R$ reach final states infinitely often.
Furthermore, it has $2^{O(s^2)}$ states. □

4.1 Closure properties of $\mathcal{L}$(ωDOPBA) under intersection and union

The class of languages accepted by ωDOPBAs is closed under intersection and union.

Closure under intersection 

Theorem 4. Let $L_1$ and $L_2$ be ω-languages that can be recognized by two ωDOPBAs
defined over the same alphabet $\Sigma$, with compatible precedence matrices $M_1$ and $M_2$
and $s_1$ and $s_2$ states respectively. Then $L = L_1 \cap L_2$ is recognizable by an ωDOPBA with
OPM $M = M_1 \cap M_2$ and $O(s_1 s_2)$ states.

Proof. The proof derives from the analogous proof of closure with respect to
intersection of languages recognized by ωOPBAs described in [13]. In fact, the ωOPBA which
accepts the intersection of two languages $L_1$ and $L_2$ recognized by two ωOPBAs $A_1$ and
$A_2$ with compatible OPMs described in that proof is deterministic if both the automata
$A_1$ and $A_2$ are deterministic. □

Closure under union 

Theorem 5. Let $L_1$ and $L_2$ be ω-languages that can be recognized by two ωDOPBAs
defined over the same alphabet $\Sigma$, with compatible precedence matrices $M_1$ and $M_2$
and $s_1$ and $s_2$ states respectively. Then $L = L_1 \cup L_2$ is recognizable by an ωDOPBA
with OPM $M = M_1 \cup M_2$ and $O(s_1 s_2)$ states.

Proof. Let $A_1 = \langle \Sigma, M_1, Q_1, q_{01}, F_1, \delta_1 \rangle$ and $A_2 = \langle \Sigma, M_2, Q_2, q_{02}, F_2, \delta_2 \rangle$ be two
ωDOPBAs accepting the languages $L(A_1) = L_1$ and $L(A_2) = L_2$ and with compatible
precedence matrices $M_1$ and $M_2$. Suppose without loss of generality that $Q_1$ and $Q_2$ are
disjoint. Let $|Q_1| = s_1$ and $|Q_2| = s_2$.

Since $M_1$ and $M_2$ are compatible, $M = M_1 \cup M_2$ is conflict-free and the two
ωDOPBAs may be normalized by completing their precedence matrices to $M = M_1 \cup M_2$ (see
e.g. the normalization described in [13]). The normalization preserves the determinism
of the automata and keeps their sets of states disjoint.

The automata may then be completed as regards their transition function, so that
there is a run on their graph for every ω-word in $L_M$ [13]. The completed automata
$\tilde A_1 = \langle \Sigma, M = M_1 \cup M_2, \tilde Q_1, \tilde q_{01}, \tilde F_1, \tilde\delta_1 \rangle$ and $\tilde A_2 = \langle \Sigma, M = M_1 \cup M_2, \tilde Q_2, \tilde q_{02}, \tilde F_2, \tilde\delta_2 \rangle$
are still deterministic with disjoint state sets and recognize the same languages as $A_1$
and $A_2$, i.e. $L(\tilde A_1) = L_1$ and $L(\tilde A_2) = L_2$. Furthermore, $|\tilde Q_1| = O(s_1)$ and $|\tilde Q_2| = O(s_2)$.

An ωDOPBA $A_3$ which recognizes $L_1 \cup L_2$ may then be defined adopting the usual
product construction for regular automata: $A_3 = \langle \Sigma, M = M_1 \cup M_2, Q_3, q_{03}, F_3, \delta_3 \rangle$
where:

- $Q_3 = \tilde Q_1 \times \tilde Q_2$,

- $q_{03} = (\tilde q_{01}, \tilde q_{02})$,

- $F_3 = \tilde F_1 \times \tilde Q_2 \cup \tilde Q_1 \times \tilde F_2$,

- and the transition function $\delta_3 : Q_3 \times (\Sigma \cup Q_3) \to Q_3$ is defined as follows. The push
transition $\delta_{3\mathrm{push}} : Q_3 \times \Sigma \to Q_3$ is expressed as:

$$\delta_{3\mathrm{push}}((q_1, q_2), a) = (\tilde\delta_{1\mathrm{push}}(q_1, a), \tilde\delta_{2\mathrm{push}}(q_2, a)) \qquad \forall q_1 \in \tilde Q_1,\ q_2 \in \tilde Q_2,\ a \in \Sigma.$$

The flush transition $\delta_{3\mathrm{flush}} : Q_3 \times Q_3 \to Q_3$ is defined as:

$$\delta_{3\mathrm{flush}}((q_1, q_2), (p_1, p_2)) = (\tilde\delta_{1\mathrm{flush}}(q_1, p_1), \tilde\delta_{2\mathrm{flush}}(q_2, p_2)) \qquad \forall q_1, p_1 \in \tilde Q_1,\ q_2, p_2 \in \tilde Q_2.$$

The ωDOPBA $A_3$ simulates $\tilde A_1$ and $\tilde A_2$ respectively on the two components of the
states, and accepts an ω-word iff there is an accepting run on it for at least one of the
two automata.

The definition of the transition function is sound because the automata $\tilde A_1$ and $\tilde A_2$
have the same precedence matrix, thus they perform the same type of move (mark/push/flush)
while reading the input word; furthermore, they are both complete w.r.t. their
transition function and neither of them may stop a computation while reading a string. □
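
The product construction used in this proof can be sketched in Python as follows. The sketch assumes that each of the two completed deterministic automata is given as a dictionary with the hypothetical keys `states`, `init`, `final`, `push` and `flush`, where `push` and `flush` are total tables; the precedence matrix is not modelled, since both automata share it and therefore always perform the same kind of move.

```python
from itertools import product
from typing import Set


def union_product(a1: dict, a2: dict, alphabet: Set[str]) -> dict:
    """Product of two complete deterministic automata, accepting the union."""
    states = set(product(a1["states"], a2["states"]))
    # Both components move in lockstep on push (and mark) moves ...
    push = {((q1, q2), a): (a1["push"][q1, a], a2["push"][q2, a])
            for (q1, q2) in states for a in alphabet}
    # ... and on flush moves, each towards its own component of the stacked pair.
    flush = {((q1, q2), (p1, p2)): (a1["flush"][q1, p1], a2["flush"][q2, p2])
             for (q1, q2) in states for (p1, p2) in states}
    return {
        "states": states,
        "init": (a1["init"], a2["init"]),
        # A pair is final as soon as one of its components is final.
        "final": {(q1, q2) for (q1, q2) in states
                  if q1 in a1["final"] or q2 in a2["final"]},
        "push": push,
        "flush": flush,
    }
```

Since the run of the product is unique, it visits a final pair infinitely often exactly when at least one of the two component runs visits its own final states infinitely often, which is why this choice of final pairs yields the union rather than the intersection.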

5 Conclusions and further research 

We presented a formalism for infinite-state model checking based on operator
precedence languages, continuing to explore the paths in the lode of operator precedence
languages opened by Robert Floyd a long time ago. We introduced various classes
of automata able to recognize operator precedence languages of infinite-length words,
whose expressive power exceeds that of classical models for infinite-state systems such as
Visibly Pushdown ω-languages, allowing one to represent more complex systems in several
practical contexts. We proved the closure properties of ωOPLs under Boolean
operations that, along with the decidability of the emptiness problem, are fundamental for
the application of such a formalism to model checking. For instance, with reference to
Example 2, imagine that one builds a specialized system that includes only procedures
of type $a$ and where interrupts of lowest level are disabled when there is any pending
$call_a$: once a new model $A'$ for such a system has been built, one can automatically verify
its compliance with the more general one $A$ by checking whether $L(A') \subseteq L(A)$.

Our results open further directions of research. A first topic is the investigation
of properties and fields of application of OPAs and ωOPAs as transducers, as they
may e.g. translate tagged documents written in mark-up languages (such as XML, HTML)
into the final displayed (XML, HTML) page, or they may translate the traces of
operations of do-undo actions performed on different versions of a file into an end-user
log or document. Thus, it might be possible to define a formal translation from
structured or semistructured languages or patterns of tasks and client behaviors into suitable
final-user views of the model.

A second interesting research issue is the characterization of ωOPLs in terms of
suitable monadic second order logical formulas, which has already been studied for
operator precedence languages of finite-length strings [11]. This would further strengthen
the applicability of model checking techniques. The next step of investigation will regard
the actual design and the study of complexity issues of algorithms for model checking of
expressive logics on these pushdown models. We expect that the peculiar features of
operator precedence languages, such as their "locality principle", which makes them suitable
for parallel and incremental parsing [2,3], and their expressivity, might be
exploited to devise efficient and attractive software model-checking procedures and
approaches.